Open 7Anonym opened 1 year ago
A similar problem occurs on the ImageNet-R dataset, where the result I get (with the suggested scripts) is also lower than the reported values.
Hi, thanks for your comment.
It is my fault for writing the README incorrectly.
The README listed batch_size as 16, but the script files actually used for training (train_cifar100_dualprompt.sh and train_imr_dualprompt.sh) use 24; I confused it with L2P.
The README will be corrected. Sorry for the confusion once again.
Please experiment with batch_size 24 and feel free to comment if anything still looks strange.
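For anyone double-checking the setting, here is a minimal sketch of how a batch size of 24 would typically be wired into a PyTorch DataLoader; the dataset path and transform below are illustrative and not this repo's actual pipeline:

import torch
from torchvision import datasets, transforms

# Illustrative transform: CIFAR-100 images upscaled to the 224x224 input
# expected by ViT-B/16; the repo's own augmentation pipeline may differ.
transform = transforms.Compose([
    transforms.Resize(224),
    transforms.ToTensor(),
])

train_set = datasets.CIFAR100(root="/local_datasets/", train=True,
                              download=True, transform=transform)

# batch_size=24 matches the training scripts; the README previously said 16.
loader = torch.utils.data.DataLoader(train_set, batch_size=24,
                                     shuffle=True, num_workers=4)
print(len(loader), "iterations per epoch over the full (non-split) train set")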
Best, Jaeho Lee.
Hi Jaeho,
Thanks for your timely reply. After changing the batch size, I obtain almost the same performance, i.e., a final accuracy of 77.22.
In fact, I found a way to reproduce the results: changing the pre-trained model in Line 41 to vit_base_patch16_224_in21k.
In my opinion, the original L2P and DualPrompt use the ImageNet-21k pretrained ViT model, since the link in the original repo also points to it (see this issue).
With this pre-trained model, I obtain results competitive with the original paper.
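For reference, a minimal sketch of loading that backbone through timm; the model name below follows older timm releases, and as far as I know recent releases expose the same weights under the alias vit_base_patch16_224.augreg_in21k:

import timm

# ImageNet-21k pretrained ViT-B/16. In newer timm releases the equivalent
# model name is "vit_base_patch16_224.augreg_in21k".
backbone = timm.create_model("vit_base_patch16_224_in21k", pretrained=True)
print(sum(p.numel() for p in backbone.parameters()))  # quick sanity check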
Hi, thanks for the comment.
I found that timm's vit_base_patch16_224_in21k and the model used in the official code are different: the official code uses the augmented version of the vit_base_patch16_224_in21k checkpoint.
So I modified the URL of vit_base_patch16_224 to be the same as the model used in the official code. Check out my code here.
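For readers who want to try the same idea without checking out the repo, here is a rough sketch of such a URL override, assuming an older timm release where default_cfgs is a plain dict; the URL in the snippet is a placeholder, not the actual checkpoint location, and the checkpoint format still has to be one timm can load:

import timm
from timm.models import vision_transformer

# Placeholder only -- substitute the checkpoint URL that the official
# DualPrompt code actually downloads; this string is NOT a real URL.
OFFICIAL_CKPT_URL = "https://example.com/augreg_in21k_vit_b16_checkpoint"

# In older timm releases (around 0.5.x/0.6.x), default_cfgs is a plain dict of
# dicts, so the pretrained URL of an existing entry can be patched in place.
# Newer timm versions use a pretrained-config registry and need a different approach.
vision_transformer.default_cfgs["vit_base_patch16_224"]["url"] = OFFICIAL_CKPT_URL

model = timm.create_model("vit_base_patch16_224", pretrained=True)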
On my side the results still reproduce well. Please let me know if there is anything I missed, and provide details such as your environment, GPU, etc.
Let's discuss this issue together.
Best, Jaeho Lee.
Hi Jaeho,
Thanks for your comment! I noticed your modification of the pretrained model, which is consistent with the original repo.
After this clarification, I am not quite sure about the reason for the performance gap. I will set up the same environment as yours for evaluation, to check whether the gap comes from a different torch version, etc.
BTW, could you provide your training log from your environment? I am still curious about the reason for such a performance gap. Thanks!
Sorry for the delayed reply. I have been busy with work.
The training log is here. Please check it.
dualprompt_result.txt
This is the result of only randomizing the seed (--seed=$RANDOM) in train_cifar100_dualprompt.sh.
I still don't know why the performance degradation is occurring. More information and experiments are needed.
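Since the remaining gap may come down to environment and seeding, here is a generic seeding sketch (not this repo's actual code) showing the RNG state that usually needs to be fixed when comparing runs:

import random
import numpy as np
import torch

def set_seed(seed: int) -> None:
    # Fix the Python, NumPy and PyTorch RNGs; the cuDNN determinism flags are
    # optional and can slow training down.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    torch.cuda.manual_seed_all(seed)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

set_seed(42)  # the scripts pass --seed=$RANDOM instead of a fixed value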
Regarding "I will set up the same environment as yours": does this mean that the previous experiment was conducted in a different environment than mine?
If so, environmental differences also need to be considered.
Please let me know the results after experimenting in the same environment as mine.
Best, Jaeho Lee.
In your log, you use batch_size=24 and e-prompt length=5. I don't think this setting matches the official code, does it?
Thanks for your great job in reproducing these works with Pytorch!
I have a question about the reproduced results. I am running the code with the script suggested in the README on a single GPU:
python -m torch.distributed.launch \
    --nproc_per_node=1 \
    --use_env main.py \
    cifar100_dualprompt \
    --model vit_base_patch16_224 \
    --batch-size 16 \
    --data-path /local_datasets/ \
    --output_dir ./output
This runs the CIFAR-100 training process on a single GPU. The full log is attached here: dualprompt.txt. The final performance is listed as:
Averaged stats: Lr: 0.001875 Loss: -0.7786 Acc@1: 93.7500 (93.3000) Acc@5: 100.0000 (99.3200)
Test: [Task 1] [ 0/63] eta: 0:00:17 Loss: 0.9931 (0.9931) Acc@1: 75.0000 (75.0000) Acc@5: 93.7500 (93.7500) time: 0.2747 data: 0.1619 max mem: 2112
Test: [Task 1] [30/63] eta: 0:00:03 Loss: 0.6518 (0.5845) Acc@1: 81.2500 (83.6694) Acc@5: 100.0000 (97.5806) time: 0.1127 data: 0.0001 max mem: 2112
Test: [Task 1] [60/63] eta: 0:00:00 Loss: 0.4332 (0.5457) Acc@1: 81.2500 (83.4016) Acc@5: 100.0000 (97.9508) time: 0.1190 data: 0.0001 max mem: 2112
Test: [Task 1] [62/63] eta: 0:00:00 Loss: 0.3826 (0.5352) Acc@1: 87.5000 (83.8000) Acc@5: 100.0000 (98.0000) time: 0.1163 data: 0.0001 max mem: 2112
Test: [Task 1] Total time: 0:00:07 (0.1176 s / it)
It seems the final acc@1 is 77.3 instead of the reported 86.51. Did I miss anything? Thanks!
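For context, my understanding is that the reported number is the average of the per-task Acc@1 values measured after the final task; a toy sketch of that averaging (the helper below is hypothetical, not from the repo):

def final_average_accuracy(per_task_acc1):
    """Average of the per-task Acc@1 values measured after the last task."""
    return sum(per_task_acc1) / len(per_task_acc1)

# Hypothetical usage: fill in the ten per-task Acc@1 values from the log,
# e.g. Task 1 ends at 83.8 in the snippet above.
# print(final_average_accuracy([83.8, ...]))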