Question on reproducing the 5-way 5-shot result in Table 4

e96031413 commented 3 years ago

[Image: Table 4]

Hello, I am trying to reproduce the results in Table 4 (5-way 1-shot and 5-way 5-shot).

I can achieve the expected performance in the 5-way 1-shot setting of Table 4.

# AGAM not using attributes (5-way 5-shot)
CUDA_VISIBLE_DEVICES=2 python train_wo_attribute.py --train-data cub --test-data cub --backbone conv4 --num-shots 5 --semantic-type class_attributes --num-workers 8 --batch-tasks 2 --lr 0.001

However, when I switched to the 5-way 5-shot setting, I found that the test accuracy without attributes is better than that of the full AGAM, which differs from Table 4.

# AGAM with full attributes (5-way 5-shot)
CUDA_VISIBLE_DEVICES=2 python train.py --train-data cub --test-data cub --backbone conv4 --num-shots 5 --semantic-type class_attributes --num-workers 8 --batch-tasks 2 --lr 0.001

Should I modify the source code further, or am I missing something?

bighuang624 commented 3 years ago

@e96031413

Sorry, I just saw this feedback. I don't know how you implemented train_wo_attribute.py, nor do I know the specific results you got, so I don't have an answer at the moment. Can you provide some details to facilitate our inspection?

e96031413 commented 3 years ago

Hi, I implemented train_wo_attribute.py with the following code, which you provided in a previous issue:

CUDA_VISIBLE_DEVICES=1 python train_wo_attribute.py --train-data cub --test-data cub --backbone conv4 --num-shots 5 --semantic-type class_attributes --num-workers 8 --batch-tasks 2 --lr 0.001

class ProtoNetAGAMwoAttr(nn.Module):
    def __init__(self, backbone, semantic_size, out_channels):
        super(ProtoNetAGAMwoAttr, self).__init__()
        self.encoder = get_backbone(backbone)

        self.ca_block = CABlock(out_channels)
        self.sa_block = SABlock()

    def forward(self, inputs):

        embeddings = self.encoder(inputs.view(-1, *inputs.shape[2:]))
        ca_embeddings, ca_weights = self.ca_block(embeddings)
        embeddings = ca_embeddings
        sa_embeddings, sa_weights = self.sa_block(embeddings)
        embeddings = sa_embeddings

        return embeddings.view(*inputs.shape[:2], -1)

Also, I ran an experiment comparing the performance with 1 attribute and with 9 attributes; the results show that 1 attribute performs similarly to 9 attributes.

Here are the steps I used to produce the experiment results (the 1-attribute and 9-attribute runs follow the same steps):

Modify line 100 in AGAM/global_utils.py to semantic_size_list.append(1)

Modify AGAM/torchmeta/datasets/semantic.py:

class CUBClassDataset(ClassDataset):
    folder = 'cub'

    # Google Drive ID from http://www.vision.caltech.edu/visipedia-data/CUB-200-2011/CUB_200_2011.tgz
    gdrive_id = '1hbzc_P1FuxMkcabkgn9ZKinBwW683j45'
    tgz_filename = 'CUB_200_2011.tgz'
    tgz_md5 = '97eceeb196236b17998738112f37df78'
    image_folder = 'CUB_200_2011/images'

    filename = '{0}_data.hdf5'
    filename_labels = '{0}_labels.json'

    assets_dir = 'assets'
    text_dir = 'text_c10'
    attribute_dir = 'attributes'
    # class_attribute_filename_labels = 'class_attribute_labels_continuous.txt'
    class_attribute_filename_labels = 'class_attribute_labels_continuous_first_1_attribute.txt'
    image_id_name_filename = 'images.txt'
    # image_attribute_filename_labels = 'image_attribute_labels.txt'
    image_attribute_filename_labels = 'image_attribute_labels_first_1_attribute.txt'
    classes_filename = 'classes.txt'
    attributes_dim = 1

I created 'class_attribute_labels_continuous_first_1_attribute.txt' and 'image_attribute_labels_first_1_attribute.txt' from the original files by keeping only the first attribute in each txt file.
Train with the command CUDA_VISIBLE_DEVICES=2 python train.py --train-data cub --test-data cub --backbone conv4 --num-shots 5 --semantic-type class_attributes --num-workers 8 --batch-tasks 2 --lr 0.001
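
For reference, the file-truncation step above could be sketched roughly as follows. This is only an illustration, not the exact script used: it assumes the standard CUB-200-2011 layout, where each row of the class-attribute file holds one space-separated value per attribute and each row of the image-attribute file carries the attribute id in its second field. The helper names truncate_class_attributes and truncate_image_attributes are made up for this sketch.

```python
from pathlib import Path


def truncate_class_attributes(src: Path, dst: Path, k: int) -> None:
    """Keep the first k columns of every row (assumed: one row per class)."""
    with src.open() as fin, dst.open("w") as fout:
        for line in fin:
            values = line.split()
            if values:  # skip empty lines
                fout.write(" ".join(values[:k]) + "\n")


def truncate_image_attributes(src: Path, dst: Path, k: int) -> None:
    """Keep rows whose attribute id (assumed: second field, 1-based) is <= k."""
    with src.open() as fin, dst.open("w") as fout:
        for line in fin:
            fields = line.split()
            if len(fields) >= 2 and 1 <= int(fields[1]) <= k:
                fout.write(line if line.endswith("\n") else line + "\n")


# Example call (the directory is a placeholder, adjust to the local dataset path):
# attr_dir = Path("attributes")
# truncate_class_attributes(
#     attr_dir / "class_attribute_labels_continuous.txt",
#     attr_dir / "class_attribute_labels_continuous_first_1_attribute.txt",
#     k=1,
# )
```

With k=9 the same two helpers would produce the 9-attribute files, as long as attributes_dim and the semantic size are changed to match.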

The test accuracy for 5-shot with 1 attribute is 81.18.
The test accuracy for 5-shot with 9 attributes is 81.08.
The test accuracy for 5-shot without attributes is 81.7.
However, for 5-shot with 312 attributes (the original setting in the paper), the test accuracy is 77.96.
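
To make the gaps explicit, here is a small sketch that only does arithmetic on the four accuracies reported above, measuring each run against the no-attribute run. The dictionary keys are labels chosen for illustration.

```python
# Test accuracies (%) for 5-way 5-shot on CUB, as reported in this comment.
reported_acc = {
    "1 attribute": 81.18,
    "9 attributes": 81.08,
    "without attributes": 81.7,
    "312 attributes": 77.96,
}

baseline = reported_acc["without attributes"]

# Gap of each run relative to the no-attribute run, in percentage points.
gaps = {name: round(acc - baseline, 2) for name, acc in reported_acc.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap:+.2f}")
```

The 1-attribute and 9-attribute runs stay within about 0.6 points of the no-attribute run, while the full 312-attribute run is 3.74 points below it.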
