bighuang624 / AGAM

Code for the AAAI 2021 paper "Attributes-Guided and Pure-Visual Attention Alignment for Few-Shot Recognition".
https://kyonhuang.top/publication/attributes-guided-attention-module

Question on reproducing the 5-way 5-shot result in Table 4 #4

Open e96031413 opened 3 years ago

e96031413 commented 3 years ago

[Image: table4]

Hello, I am trying to reproduce the results in Table 4 (5-way 1-shot and 5-way 5-shot).

I can achieve the expected performance in the 5-way 1-shot setting (Table 4, 5-way 1-shot).

# AGAM not using attributes (5-way 5-shot)
CUDA_VISIBLE_DEVICES=2 python train_wo_attribute.py --train-data cub --test-data cub --backbone conv4 --num-shots 5 --semantic-type class_attributes --num-workers 8 --batch-tasks 2 --lr 0.001

However, when I changed to the 5-way 5-shot setting, I found that the test accuracy without attributes is better than that of the full AGAM, which differs from Table 4.

# AGAM with full attribute (5-way 5-shot)
CUDA_VISIBLE_DEVICES=2 python train.py --train-data cub --test-data cub --backbone conv4 --num-shots 5 --semantic-type class_attributes --num-workers 8 --batch-tasks 2 --lr 0.001

Should I further modify the source code, or am I missing something?

bighuang624 commented 3 years ago

@e96031413

Sorry, I just saw this feedback. I don't know how you implemented train_wo_attribute.py, nor do I know the specific results you got, so I don't have an answer at the moment. Can you provide some details to facilitate our inspection?

e96031413 commented 3 years ago

Hi, I implemented train_wo_attribute.py with the following code, which you provided in a previous issue:

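CUDA_VISIBLE_DEVICES=1 python train_wo_attribute.py --train-data cub --test-data cub --backbone conv4 --num-shots 5 --semantic-type class_attributes --num-workers 8 --batch-tasks 2 --lr 0.001

import torch.nn as nn

# get_backbone, CABlock and SABlock are assumed to come from the AGAM repository's model code.

class ProtoNetAGAMwoAttr(nn.Module):
    def __init__(self, backbone, semantic_size, out_channels):
        # semantic_size is accepted but unused in this attribute-free variant.
        super(ProtoNetAGAMwoAttr, self).__init__()
        self.encoder = get_backbone(backbone)

        self.ca_block = CABlock(out_channels)  # channel attention
        self.sa_block = SABlock()              # spatial attention

    def forward(self, inputs):
        # Merge the first two dimensions so the encoder sees one flat batch of images.
        embeddings = self.encoder(inputs.view(-1, *inputs.shape[2:]))
        # Apply channel attention, then spatial attention; the attention weights are not used here.
        ca_embeddings, ca_weights = self.ca_block(embeddings)
        embeddings = ca_embeddings
        sa_embeddings, sa_weights = self.sa_block(embeddings)
        embeddings = sa_embeddings

        # Restore the first two dimensions and flatten each image's features.
        return embeddings.view(*inputs.shape[:2], -1)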
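
For readers following the two view calls in forward, here is a minimal shape-only sketch in plain Python (no PyTorch needed). The helper names are hypothetical, and the example input shape (2 tasks, 25 images of 3x84x84) and per-image feature shape (64, 5, 5) are illustrative assumptions, not values confirmed by the repository.

```python
from math import prod


def encoder_input_shape(inputs_shape):
    """Shape after inputs.view(-1, *inputs.shape[2:]): the first two dims are merged."""
    return (inputs_shape[0] * inputs_shape[1], *inputs_shape[2:])


def forward_output_shape(inputs_shape, feature_shape):
    """Shape after embeddings.view(*inputs.shape[:2], -1).

    feature_shape is the assumed per-image shape after the encoder and
    attention blocks (hypothetical value in the example below).
    """
    return (*inputs_shape[:2], prod(feature_shape))


# Hypothetical example: 2 tasks per batch, 25 images per task, 3x84x84 images.
example_inputs = (2, 25, 3, 84, 84)
flat_shape = encoder_input_shape(example_inputs)
out_shape = forward_output_shape(example_inputs, (64, 5, 5))
print(flat_shape, out_shape)
```

Under these assumptions, the encoder receives 50 images at once and the module returns one flattened feature vector per image, grouped by task.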

Also, I ran an experiment to compare the performance of 1 attribute against 9 attributes; the results showed that using 1 attribute performs similarly to using 9 attributes.

Here are the steps I used to produce the experiment results (the 1-attribute and 9-attribute runs follow the same steps):

The test acc for 5-shot with 1 attribute is 81.18.
The test acc for 5-shot with 9 attributes is 81.08.
The test acc for 5-shot without attributes is 81.7.
However, for 5-shot with 312 attributes (the original setting in the paper), the test acc is 77.96.
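
To make the size of the discrepancy explicit, the numbers above can be compared against the 312-attribute run with a few lines of Python. This is only arithmetic on the accuracies reported in this comment; the dictionary keys and the helper name are illustrative, not part of the AGAM code.

```python
# Test accuracies (%) reported in this comment for the 5-way 5-shot CUB runs.
reported_acc = {
    "1 attribute": 81.18,
    "9 attributes": 81.08,
    "without attributes": 81.7,
    "312 attributes (paper setting)": 77.96,
}

baseline = "312 attributes (paper setting)"


def gaps_to_baseline(acc, baseline_key):
    """Return each run's accuracy minus the baseline accuracy, rounded to 2 decimals."""
    base = acc[baseline_key]
    return {
        name: round(value - base, 2)
        for name, value in acc.items()
        if name != baseline_key
    }


gaps = gaps_to_baseline(reported_acc, baseline)
for name, gap in gaps.items():
    print(f"{name}: {gap:+.2f} points vs. {baseline}")
```

All three reduced-attribute runs land more than 3 points above the 312-attribute run, with the attribute-free model showing the largest gap (3.74 points).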