G-U-N / PyCIL

PyCIL: A Python Toolbox for Class-Incremental Learning
800 stars 137 forks

why FeTrIL acc is so bad #22

Closed muyuuuu closed 1 year ago

muyuuuu commented 1 year ago

config:

{
    "prefix": "train",
    "dataset": "cifar100",
    "memory_size": 0,
    "shuffle": true,
    "init_cls": 50,
    "increment": 10,
    "model_name": "fetril",
    "convnet_type": "resnet32",
    "device": ["0"],
    "seed": [1993],
    "init_epochs": 200,
    "init_lr" : 0.1,
    "init_weight_decay" : 0,
    "epochs" : 50,
    "lr" : 0.05,
    "batch_size" : 128,
    "weight_decay" : 0,
    "num_workers" : 8,
    "T" : 2
}

final result:

2022-12-16 23:22:29,436 [fetril.py] => svm train: acc: 10.01
2022-12-16 23:22:29,451 [fetril.py] => svm evaluation: acc_list: [15.2, 12.47, 10.84, 9.78, 9.06, 8.13]
2022-12-16 23:22:31,440 [trainer.py] => No NME accuracy.
2022-12-16 23:22:31,441 [trainer.py] => CNN: {'total': 13.63, '00-09': 19.1, '10-19': 18.3, '20-29': 22.5, '30-39': 17.3, '40-49': 30.1, '50-59': 3.5, '60-69': 11.4, '70-79': 3.1, '80-89': 6.1, '90-99': 4.9, 'old': 14.6, 'new': 4.9}
2022-12-16 23:22:31,441 [trainer.py] => CNN top1 curve: [28.94, 21.87, 18.54, 16.76, 15.1, 13.63]
2022-12-16 23:22:31,441 [trainer.py] => CNN top5 curve: [53.28, 47.07, 43.17, 39.08, 36.41, 33.98]

I can provide log if you need.

caoshuai888 commented 1 year ago

[Automatic vacation reply from a QQ Mail account] Hello, I am currently on vacation and cannot reply to your email in person. I will get back to you as soon as possible after the vacation ends.

G-U-N commented 1 year ago

It seems that you failed at the base training stage, did you wrongly modify the code for base training?

muyuuuu commented 1 year ago

> It seems that you failed at the base training stage, did you wrongly modify the code for base training?

Yes. On line 104: https://github.com/G-U-N/PyCIL/blob/master/models/fetril.py#L104

I uncommented that line; otherwise an error is reported.

G-U-N commented 1 year ago

your training log?

muyuuuu commented 1 year ago

> your training log?

Sorry, it was lost. Please wait 10 minutes.

muyuuuu commented 1 year ago

Sorry for the delay.

final result:

2022-12-19 13:51:09,278 [trainer.py] => CNN top1 curve: [80.42, 71.2, 66.36, 62.06, 59.57, 56.16]
2022-12-19 13:51:09,278 [trainer.py] => CNN top5 curve: [96.72, 91.63, 89.34, 87.46, 85.46, 83.33]

I had set track_running_stats=False on all BN layers, which is why the accuracy was bad. Setting track_running_stats=True gives good performance.

So why does this happen? To compare against my algorithm fairly, I need to set track_running_stats=False on all BN layers.
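For reference, the change described above can be applied to an existing model like this (a minimal PyTorch sketch, not PyCIL code; `model` stands for any `nn.Module`, and clearing the buffers is one assumed way to make eval mode fall back to batch statistics):

```python
import torch.nn as nn

def set_bn_tracking(model: nn.Module, track: bool) -> None:
    """Toggle track_running_stats on every BatchNorm layer in a model."""
    for m in model.modules():
        if isinstance(m, nn.modules.batchnorm._BatchNorm):
            m.track_running_stats = track
            if not track:
                # Drop the stored statistics so that eval mode falls back
                # to per-batch statistics, matching training-time behavior.
                m.running_mean = None
                m.running_var = None
                m.num_batches_tracked = None
```

Note that simply flipping the flag without clearing the buffers would leave stale running statistics in place, which BatchNorm would still use at eval time.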

G-U-N commented 1 year ago

Glad to see you have reproduced the ideal results in our framework.

For your question:

Setting track_running_stats to False causes running_mean and running_var to be frozen, which typically makes optimization in CNNs much harder, especially when training from scratch. You'd better learn more about how BN works in neural networks.
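The difference can be seen in a minimal PyTorch sketch (illustrative only; the shapes and seed are arbitrary assumptions): with the default setting, BatchNorm accumulates running statistics during training, while with track_running_stats=False no statistics are ever stored and every batch, even at eval time, is normalized with its own mean and variance.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
x = torch.randn(32, 4) * 5 + 10  # inputs far from zero mean / unit variance

# Default BN: accumulates running_mean / running_var during training.
bn_tracked = nn.BatchNorm1d(4)
# track_running_stats=False: running statistics are never created.
bn_frozen = nn.BatchNorm1d(4, track_running_stats=False)

bn_tracked.train()
bn_frozen.train()
bn_tracked(x)
bn_frozen(x)

# After one training step, the tracked BN has moved its running mean
# toward the data (by momentum * batch_mean, with momentum=0.1 by default);
# the untracked BN has no running statistics at all (running_mean is None).
print(bn_tracked.running_mean)
print(bn_frozen.running_mean)
```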

muyuuuu commented 1 year ago

wait.....

Look at this repo (task-incremental learning): https://github.com/sahagobinda/GPM/blob/main/main_cifar100.py

All of its BN layers set track_running_stats to False, but it still gets good performance...

Maybe this is a separate question?

G-U-N commented 1 year ago

I have no obligation to answer questions outside our framework, thanks.

muyuuuu commented 1 year ago

I'm sorry, I was negligent.