arthurdouillard / dytox

Dynamic Token Expansion with Continual Transformers, accepted at CVPR 2022
https://arxiv.org/abs/2111.11326
Apache License 2.0

The avg accuracy on CIFAR100 50steps #19

Open jmin0530 opened 2 years ago

jmin0530 commented 2 years ago

Hello, thank you for your code. I used the DyTox settings for 50 steps, but I got different results from those in your paper.

I ran the CLI command below:

bash train.sh 0,1 \
    --options options/data/cifar100_2-2.yaml options/data/cifar100_order1.yaml options/model/cifar_dytox.yaml \
    --name dytox \
    --data-path MY_PATH_TO_DATASET \
    --output-basedir PATH_TO_SAVE_CHECKPOINTS \
    --memory-size 1000

According to your paper, your result on CIFAR-100 with 50 steps is "Avg acc: 64.82, Last acc: 45.61". Here are my reproduction results for the three CIFAR-100 orders:

Here is the DyTox configuration I used:

# DyTox, for CIFAR100

# Model definition
model: convit
embed_dim: 384
depth: 6
num_heads: 12
patch_size: 4
input_size: 32
local_up_to_layer: 5
class_attention: true

# Training setting
no_amp: true
eval_every: 50

# Base hyperparameter
weight_decay: 0.000001
batch_size: 128
incremental_batch_size: 256
incremental_lr: 0.0005
rehearsal: icarl_all

# Knowledge Distillation
auto_kd: true

# Finetuning
finetuning: balanced
finetuning_epochs: 20
ft_no_sampling: true

# Dytox model
dytox: true
freeze_task: [old_task_tokens, old_heads]
freeze_ft: [sab]

# Divergence head to get diversity
head_div: 0.1
head_div_mode: tr

# Independent Classifiers
ind_clf: 1-1
bce_loss: true

# Advanced Augmentations, here disabled

# Erasing
reprob: 0.0
remode: pixel
recount: 1
resplit: false

# MixUp & CutMix
mixup: 0.0
cutmix: 0.0

I can't understand why my reproduction results differ from the results you wrote in your paper. Thank you.
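
When comparing against the paper's numbers, it can help to check that both metrics are computed the same way. The following is a minimal sketch, assuming the usual class-incremental convention that "Avg acc" is the mean of the accuracies measured after each step and "Last acc" is the accuracy after the final step, and assuming `cifar100_2-2` means 2 initial classes followed by increments of 2 classes. The function names and the accuracy values are hypothetical placeholders, not DyTox code or real results.

```python
from typing import Sequence


def num_steps(total_classes: int, initial_classes: int, increment: int) -> int:
    """Number of incremental steps, counting the initial step."""
    remaining = total_classes - initial_classes
    if remaining < 0 or remaining % increment != 0:
        raise ValueError("increment must evenly divide the remaining classes")
    return 1 + remaining // increment


def average_incremental_accuracy(step_accuracies: Sequence[float]) -> float:
    """Mean of the accuracies measured after each step ("Avg acc")."""
    if not step_accuracies:
        raise ValueError("need at least one step accuracy")
    return sum(step_accuracies) / len(step_accuracies)


def last_accuracy(step_accuracies: Sequence[float]) -> float:
    """Accuracy measured after the final step ("Last acc")."""
    if not step_accuracies:
        raise ValueError("need at least one step accuracy")
    return step_accuracies[-1]


if __name__ == "__main__":
    # Assumed reading of cifar100_2-2: 100 classes, 2 initial, +2 per step.
    print(num_steps(100, 2, 2))  # 50

    # Placeholder accuracies for illustration only, not measured results.
    accs = [90.0, 80.0, 70.0]
    print(average_incremental_accuracy(accs))  # 80.0
    print(last_accuracy(accs))  # 70.0
```

If the training logs report one accuracy per step, feeding that list to these two functions gives numbers that can be compared directly with the "Avg acc" and "Last acc" columns.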

arthurdouillard commented 1 year ago

See https://github.com/arthurdouillard/dytox/blob/main/erratum_distributed.md

You probably want to use global memory with a memory size of 2k.

If you use distributed memory with 1k, your effective memory size is rather low (much lower than 2k).
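
To put the two budgets in perspective, here is a rough sketch, assuming a fixed total rehearsal budget shared equally among all classes seen so far. The function name is hypothetical and this is not the repository's implementation; how a distributed memory is split across GPUs is described in the erratum and is not modeled here.

```python
def exemplars_per_class(memory_size: int, seen_classes: int) -> int:
    """Exemplars each class can keep when a fixed budget is shared equally.

    Assumption: the total budget is constant and divided evenly among all
    classes seen so far (integer division, any remainder is left unused).
    """
    if memory_size < 0:
        raise ValueError("memory_size must be non-negative")
    if seen_classes <= 0:
        raise ValueError("seen_classes must be positive")
    return memory_size // seen_classes


if __name__ == "__main__":
    # After the last CIFAR-100 step, all 100 classes share the budget.
    for memory_size in (1000, 2000):
        print(memory_size, exemplars_per_class(memory_size, 100))
    # 1000 10
    # 2000 20
```

Under this assumption, halving the budget halves the number of exemplars available per class at the final step, which is where the gap in "Last acc" would be expected to show most.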