ihaeyong / WSN

Winning SubNetwork (WSN)
https://proceedings.mlr.press/v162/kang22b.html
MIT License

About the experimental setting for tinyImagenet #2

Open vicmax opened 1 year ago

vicmax commented 1 year ago

Hi,

Thanks for continuing to update the released code!

While reading the paper and the relevant code from this repo, I ran into several questions about the setting of the tinyImagenet experiments:

1. Under which setting were the experiments on tiny-ImageNet conducted?

I went through your paper and did not find any description of this point. I printed the model architecture used in the tiny-Imagenet experiments and found that each classification head has an output of 200. Based on my understanding, shouldn't there be 40 classifiers, each with an output dimension of 5?

SubNet(
  (drop1): Dropout(p=0.0, inplace=False)
  (drop2): Dropout(p=0.0, inplace=False)
  (conv1): SubnetConv2d(3, 160, kernel_size=(3, 3), stride=2, padding=(1, 1), bias=False)
  (conv2): SubnetConv2d(160, 160, kernel_size=(3, 3), stride=2, padding=(1, 1), bias=False)
  (conv3): SubnetConv2d(160, 160, kernel_size=(3, 3), stride=2, padding=(1, 1), bias=False)
  (conv4): SubnetConv2d(160, 160, kernel_size=(3, 3), stride=2, padding=(1, 1), bias=False)
  (linear1): SubnetLinear(in_features=2560, out_features=640, bias=False)
  (linear2): SubnetLinear(in_features=640, out_features=640, bias=False)
  (last): ModuleList(
    (0-39): 40 x Linear(in_features=640, out_features=200, bias=False)
  )
)

2. Class-incremental loader for tiny-Imagenet

I saw that the data loader for tiny-Imagenet was built with loader_type='class_incremental_loader'. Even under a class-incremental setting, shouldn't the 40 classifiers have cumulative outputs like this: (0) output=5; (1) output=10; (2) output=15; ...; (39) output=200?
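For reference, the two head layouts being contrasted here can be sketched as follows (the variable names and the split of 200 classes into 40 tasks are illustrative, not taken from the repo):

```python
num_tasks, num_classes = 40, 200
classes_per_task = num_classes // num_tasks  # 5 classes per task

# Task-incremental: every head covers a fixed 5-class slice.
task_il_head_sizes = [classes_per_task] * num_tasks

# Class-incremental: head t covers all classes seen up to task t.
class_il_head_sizes = [(t + 1) * classes_per_task for t in range(num_tasks)]
```

Under the task-incremental layout every head is 5-way; under the class-incremental layout head 0 is 5-way, head 1 is 10-way, and the final head is 200-way, matching the pattern in the question above.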

Sorry, I am new to the field of Continual Learning. Looking forward to any replies.

Best,

ihaeyong commented 1 year ago

Please follow the script ./scripts/wsn/wsn_tiny_image.sh to train WSN on the tinyImageNet dataset. I will clean up the source code further in the near future.

Best,

vicmax commented 1 year ago

@ihaeyong

Hi Haeyong,

I carefully read the code of FS-DGPM, which also conducted experiments on tiny-ImageNet. In their code, I found they used just one classifier with output=200. However, for task-aware evaluation, they applied array slicing to restrict predictions to a given task (between offset1 and offset2, e.g., offset1=5 and offset2=10 for the second task) and filled the outputs of the other tasks with zero (see the implementation of model(x, t) in the code here). By doing this, FS-DGPM is equivalent to having 40 five-way classifiers.
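A minimal sketch of that offset-based masking (the function name and plain-Python lists are my own illustration; FS-DGPM's model(x, t) does the equivalent on torch tensors):

```python
def mask_logits_to_task(logits, task_id, classes_per_task=5):
    """Keep only the logits in the task's class range [offset1, offset2)
    and zero the rest, as described for FS-DGPM's task-aware evaluation.
    (Hypothetical helper, not code from either repo.)"""
    offset1 = task_id * classes_per_task
    offset2 = (task_id + 1) * classes_per_task
    return [v if offset1 <= i < offset2 else 0.0
            for i, v in enumerate(logits)]
```

For example, with task_id=1 and classes_per_task=5, only logits at indices 5 through 9 survive, so the argmax over the masked vector can only pick a class belonging to the second task.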

Thus, I suspect there is a small mistake in your code, as you set up 40 200-way classifiers and did not apply any array slicing. I also wonder whether the prediction performance of WSN would improve further if the prediction were constrained to 5 logits.