Closed: zuael closed this issue 2 years ago.
Hi,
Thank you for your interest in our work; please find my response below:
The implementation of forgetting in `main.py` is consistent with the definition in the paper. The corresponding scripts for both training and evaluation have been updated to reflect that setting, to help reproduce the experimental results.
Hi! We're following your work and think it's a great job, but we may have found two errors in your project.

The first error: when we ran your project, the `si` method on `splits-cifar10` could not achieve the results in the paper. On inspection we found an error on line 32 of `models.si.py`:

```python
self.big_omega = self.small_omega / ((self.net.module.backbone.get_params().data - self.checkpoint) ** 2 + self.xi)
```
In `models.si.py` of the project you referenced, aimagelab/mammoth, the line that does the same thing is:

```python
self.big_omega += self.small_omega / ((self.net.get_params().data - self.checkpoint) ** 2 + self.args.xi)
```

The difference between these two lines is that your code uses the assignment operator `=`, while theirs uses the accumulation operator `+=`.
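To illustrate why this operator matters, here is a minimal, hypothetical sketch (plain Python lists in place of torch tensors; the helper name `importance_update` is ours, not from either repository). With `+=`, the importance estimates accumulate across tasks; with `=`, each task overwrites everything learned before it.

```python
def importance_update(small_omega, params, checkpoint, xi=1.0):
    # per-task importance term: small_omega / ((params - checkpoint)**2 + xi)
    return [so / ((p - c) ** 2 + xi)
            for so, p, c in zip(small_omega, params, checkpoint)]

small_omega = [1.0, 1.0]
params, checkpoint = [1.0, 2.0], [0.0, 0.0]

# with `+=`: importance accumulates over two simulated tasks
big_omega = [0.0, 0.0]
for _ in range(2):
    term = importance_update(small_omega, params, checkpoint)
    big_omega = [bo + t for bo, t in zip(big_omega, term)]

# with `=`: each task overwrites the previous estimate
big_omega_overwrite = importance_update(small_omega, params, checkpoint)

print(big_omega)            # [1.0, 0.4]  -- both tasks contribute
print(big_omega_overwrite)  # [0.5, 0.2]  -- only the last task survives
```

Since SI's regularizer weights parameter drift by these importances, overwriting them effectively discards the importance of parameters for all earlier tasks.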
After changing this operator, we ran the `si` method on `splits-cifar10` again; the result improved, but still could not reach the results reported in the paper.

The second error: in your test file `linear_eval_alltasks.py`, the Average Forgetting calculation seems to differ from the paper. The code also includes the forgetting value of the last task:
```python
max_knn_acc = [max(idx) for idx in zip(*knn_acc)]
mean_knn_fgt = sum([x1 - x2 for (x1, x2) in zip(max_knn_acc, knn_acc[-1])]) / len(knn_acc[-1])
```
This is different from the formula in the paper.

In addition to the above questions, we would like to ask how the optimizer is set up for your experiments with the supervised methods. Thanks!
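For comparison, here is a small, hypothetical sketch of average forgetting as it is commonly defined in the continual learning literature: the last task is excluded from both the per-task maximum and the average (we did not verify this against the exact formula in the paper; `acc[t][i]` denotes accuracy on task `i` after training on task `t`).

```python
def average_forgetting(acc):
    """Average forgetting over the first T-1 tasks, given a T x T
    accuracy matrix acc (lower-triangular entries filled after each task)."""
    T = len(acc)
    if T < 2:
        return 0.0
    final = acc[-1]
    # forgetting of task i: best accuracy seen *before* the final task,
    # minus the accuracy after the final task; the last task is excluded
    fgt = [max(acc[t][i] for t in range(T - 1)) - final[i] for i in range(T - 1)]
    return sum(fgt) / (T - 1)

acc = [
    [0.90, 0.00, 0.00],   # after task 1
    [0.80, 0.92, 0.00],   # after task 2
    [0.70, 0.85, 0.95],   # after task 3 (final)
]
result = average_forgetting(acc)  # ((0.90-0.70) + (0.92-0.85)) / 2 = 0.135
```

Under this definition, averaging over all tasks (as in the snippet above from `linear_eval_alltasks.py`) dilutes the mean with the last task's near-zero forgetting term.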