k-gyuhak / MORE


Other models mentioned in the paper seem invalid in this repository #1

Closed yushaodong closed 1 year ago

yushaodong commented 1 year ago

Thanks very much for providing the code for the paper. While reproducing the results from this repository, I found that some of the code seems unused, such as the zero-shot model and some of the baseline models from the paper. Is the repository just for MORE, and should the other experiment code be written by myself? Thank you!

k-gyuhak commented 1 year ago

Thanks for your interest in our work and sorry for the confusion. The uploaded code only works for our method MORE. There are some irrelevant lines since the code originally implemented some of the baselines as well.

yushaodong commented 1 year ago

> Thanks for your interest in our work and sorry for the confusion. The uploaded code only works for our method MORE. There are some irrelevant lines since the code originally implemented some of the baselines as well.

Thanks for your reply. I spent a lot of time trying to understand the code by myself, but there are still some questions that confuse me. I trained the network with the command line in readme.md.

  1. Does "train the network" mean training the feature extractor "deitadapter" and the classifier at the same time, while "train the classifier" means training only the old classifier with the new data (as OOD)?
  2. I didn't find where old data is trained as OOD data; is it only used to train the classifier, while the extractor is trained on the current task data?
  3. In the continual learning setting, the number of classes should be unknown, yet the feature extractor is initialized with a fixed number of classes: "args.net = transformer(pretrained=True, num_classes=num_classes, latent=args.adapter_latent, args=args).to(device)".
  4. While training the feature extractor on each task, the task id is provided to the network, and every block receives it. Does this mean every task has task-specific parameters in every block? As proposed in the paper, the task id is not provided at test time. How is the task classifier chosen? Thank you very much!
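To illustrate what I mean in question 4, here is a minimal sketch of how I understand the per-task adapter structure (hypothetical and NumPy-only for brevity; the class and variable names are mine, not the repository's):

```python
import numpy as np

rng = np.random.default_rng(0)

class BlockWithTaskAdapters:
    """Hypothetical sketch: a block with shared weights plus one small
    down/up-projection adapter per task, selected by an integer task id."""

    def __init__(self, dim, latent, num_tasks):
        # shared (e.g. pretrained) weights, used for every task
        self.W_shared = rng.standard_normal((dim, dim))
        # one (down, up) adapter pair per task
        self.adapters = [
            (rng.standard_normal((dim, latent)), rng.standard_normal((latent, dim)))
            for _ in range(num_tasks)
        ]

    def forward(self, x, task_id):
        h = x @ self.W_shared
        down, up = self.adapters[task_id]
        # residual adapter path using only this task's parameters
        return h + np.maximum(h @ down, 0.0) @ up

block = BlockWithTaskAdapters(dim=8, latent=4, num_tasks=3)
x = rng.standard_normal((2, 8))
out = block.forward(x, task_id=1)
print(out.shape)  # (2, 8)
```

If this matches the actual design, my remaining question is only how the right `task_id` (and hence the right adapter/classifier) is picked when the task id is unavailable at test time.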

hailuu684 commented 2 months ago

> Thanks for your interest in our work and sorry for the confusion. The uploaded code only works for our method MORE. There are some irrelevant lines since the code originally implemented some of the baselines as well.
>
> Thanks for your reply. I spent a lot of time trying to understand the code by myself, but there are still some questions that confuse me. I trained the network with the command line in readme.md.
>
> 1. Does "train the network" mean training the feature extractor "deitadapter" and the classifier at the same time, while "train the classifier" means training only the old classifier with the new data (as OOD)?
> 2. I didn't find where old data is trained as OOD data; is it only used to train the classifier, while the extractor is trained on the current task data?
> 3. In the continual learning setting, the number of classes should be unknown, yet the feature extractor is initialized with a fixed number of classes: "args.net = transformer(pretrained=True, num_classes=num_classes, latent=args.adapter_latent, args=args).to(device)".
> 4. While training the feature extractor on each task, the task id is provided to the network, and every block receives it. Does this mean every task has task-specific parameters in every block? As proposed in the paper, the task id is not provided at test time. How is the task classifier chosen? Thank you very much!

For question 3, I think it is to simplify the programming. Otherwise, you would need to relabel the dataset each time; for example, if the first experience has 2 classes, the labels are 0 and 1, and when the 2nd experience brings 2 new classes, they are also labeled 0 and 1.
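A minimal sketch of the relabeling I mean (the helper name is mine, not from the repository): within each experience, the global class ids are remapped to 0..k-1 so every task head can keep a fixed-size output layer.

```python
def remap_labels(global_labels, classes_in_task):
    """Map global class ids to local 0..k-1 ids for one experience.

    classes_in_task: ordered list of the global class ids in this task.
    """
    mapping = {c: i for i, c in enumerate(classes_in_task)}
    return [mapping[y] for y in global_labels]

# Experience 1 holds global classes 0 and 1; experience 2 holds 2 and 3.
print(remap_labels([0, 1, 1], [0, 1]))  # [0, 1, 1]
print(remap_labels([2, 3, 2], [2, 3]))  # [0, 1, 0]
```

Initializing the network with a fixed `num_classes` avoids carrying this mapping around, at the cost of looking less "class-incremental" in the code.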