mmasana / FACIL

Framework for Analysis of Class-Incremental Learning with 12 state-of-the-art methods and 3 baselines.
https://arxiv.org/pdf/2010.15277.pdf
MIT License
512 stars 98 forks

Upperbound results #34

Open jmin0530 opened 1 year ago

jmin0530 commented 1 year ago

Hello!

I am confused about how to read the Upperbound (Joint) results. I am running the code with the approach set to "joint" in the script, to reproduce the Upperbound (Joint) result on CIFAR-100.

Each seed produces four types of results: avg_accs_tag, avg_accs_taw, acc_tag, and acc_taw. It is unclear to me which of these corresponds to the Upperbound (Joint) result in your paper (Fig. 8).

Thank you.

mmasana commented 1 year ago

Hi @jmin0530 ,

The joint training is still an incremental one, meaning that the network goes through a training session at each task, but with access to all data from previous tasks. This basically emulates being able to store everything in the exemplar memory, which makes it an upper-bound baseline.
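In pseudocode, that training scheme looks roughly like this (illustrative names, not the actual FACIL implementation):

```python
# Sketch of joint (upper-bound) incremental training: at task t the model
# goes through a full training session on the union of all data from
# tasks 0..t. Names here are hypothetical, not FACIL's own code.

def joint_incremental_training(model, task_datasets, train_fn):
    """Run one training session per task, each time on all data seen so far."""
    seen = []  # accumulated datasets from previous tasks
    for t, dataset in enumerate(task_datasets):
        seen.append(dataset)
        joint_data = [x for d in seen for x in d]  # union of tasks 0..t
        train_fn(model, joint_data)                # full training session
    return model
```

This is equivalent to having an exemplar memory large enough to hold every previous sample, which is why no separate exemplar budget is needed.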

The metrics from files whose names contain the word tag are task-agnostic (the task-ID is not known at test time; class-IL setting), while taw means task-aware (the task-ID is known at test time; task-IL setting). avg means a plain average, while wavg means a weighted average (weighted by the number of classes in each task). All the ones you mention are called acc because they report accuracy (number of correct predictions divided by the total number of test samples).

The survey paper mainly covers the class-IL (task-agnostic) scenario. Figure 8 in particular shows the average accuracy over all classes from all tasks learned so far. That means that classes not yet learned are not included in the average, since including them would not make sense in the incremental learning scenario.
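As a rough sketch (not the FACIL source), the avg/wavg metrics over the tasks learned so far could be computed like this:

```python
import numpy as np

# acc[t][u] = accuracy on task u measured after training on task t (u <= t).
# classes_per_task[u] = number of classes in task u. Names are illustrative.
def running_averages(acc, classes_per_task):
    """Plain and class-weighted average accuracy over the tasks learned so far."""
    avg, wavg = [], []
    for t in range(len(acc)):
        accs = np.array(acc[t][:t + 1])
        w = np.array(classes_per_task[:t + 1], dtype=float)
        avg.append(accs.mean())                  # plain average  (avg_accs_*)
        wavg.append((accs * w).sum() / w.sum())  # weighted       (wavg_accs_*)
    return avg, wavg
```

Note that for CIFAR-100 (10/10), where every task has the same number of classes, avg and wavg coincide.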

Hope that helps!

jmin0530 commented 1 year ago

Thank you for your reply, but I want to be sure I understand. My "joint" approach class-incremental result (avg_accs_tag) at seed 0 is below.

0.806000 0.687500 0.669333 0.645000 0.645200 0.678000 0.655857 0.664750 0.671556 0.657800

Is this the Upperbound (Joint) result at seed 0, like the Fig. 8 result?

mmasana commented 1 year ago

The results for CIFAR-100 (10/10) (Figure 8, left) for joint training, for the first 3 seeds:

seed 0: 0.788000 0.699500 0.726000 0.737750 0.729400 0.701333 0.694286 0.657750 0.669889 0.670000
seed 1: 0.863000 0.739000 0.700333 0.701250 0.723000 0.675833 0.705429 0.683875 0.679778 0.685100
seed 2: 0.839000 0.755500 0.740333 0.744500 0.741800 0.725333 0.720286 0.705750 0.678111 0.676000

Some comments:

Since your results are a bit lower than these seeds, my guess would be that you did not set up the gridsearch. If you did, please provide some more context so we can figure out where the difference comes from.

jmin0530 commented 1 year ago

My results for CIFAR-100 (10/10) for joint training for the 10 seeds:

seed 0: 0.806000 0.687500 0.669333 0.645000 0.645200 0.678000 0.655857 0.664750 0.671556 0.657800
seed 1: 0.805000 0.766500 0.720000 0.732500 0.733000 0.703000 0.709000 0.685375 0.678889 0.664400
seed 2: 0.744000 0.692000 0.735333 0.697250 0.704600 0.704833 0.688571 0.678625 0.665556 0.665100
seed 3: 0.842000 0.738000 0.735333 0.719750 0.711800 0.709333 0.696857 0.679625 0.684333 0.661500
seed 4: 0.834000 0.651000 0.671333 0.662500 0.676800 0.673000 0.670286 0.642750 0.653222 0.641200
seed 5: 0.767000 0.665500 0.704333 0.724750 0.708800 0.728000 0.713429 0.688625 0.675111 0.657000
seed 6: 0.808000 0.677500 0.731667 0.736750 0.722200 0.711500 0.700429 0.669125 0.676667 0.674500
seed 7: 0.109000 0.548500 0.705000 0.711250 0.714800 0.697500 0.708000 0.685875 0.677889 0.663400
seed 8: 0.858000 0.725500 0.723667 0.720000 0.720400 0.715333 0.718571 0.681125 0.668000 0.652500
seed 9: 0.806000 0.712000 0.745000 0.702750 0.714800 0.699500 0.698714 0.687250 0.652333 0.664800
average: 0.7379 0.6864 0.7140999 0.70525 0.70524 0.7019999 0.6959714 0.6763125 0.6703556 0.66022

mmasana commented 1 year ago

Looking at the arguments, the difference I see is that you have "num_exemplars": 2000. As I mentioned, since Joint already has access to all the images, I set "num_exemplars": 0. The difference may come from that. I also have the exemplar sampling set to random, but that has no effect since Joint uses no exemplar memory.
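For reference, a run like the one described here would be launched roughly as follows. The flag names below are an assumption based on the repository's argument parser; please verify them with `python3 src/main_incremental.py --help`:

```shell
# Hypothetical invocation: joint training on CIFAR-100 (10 tasks),
# with the exemplar memory disabled, since Joint sees all data anyway.
python3 -u src/main_incremental.py --approach joint \
    --datasets cifar100 --num-tasks 10 \
    --num-exemplars 0 --exemplar-selection random
```

The exemplar settings are command-line arguments, so they should be passed to the main script rather than edited in the source files.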

The error from seed 7 could be anything. It happens rarely, but I think some combination of initialization and batch order drives the network to an unstable point it cannot recover from. As you can see, the loss barely moves after the first few epochs. I do not have much insight into these cases; I simply run one more seed and ignore this one, since it is clearly an unexpected outlier.

Accounting for the wrong result from seed 7, your averages with and without that outlier seed compare like this:

facil: 80.7, 69.1, 72.0, 70.7, 71.0, 69.5, 69.4, 67.3, 66.5, 66.3
you (no seed 7): 80.8, 70.2, 71.5, 70.5, 70.4, 70.3, 69.5, 67.5, 66.9, 66.0
you (with seed 7): 73.8, 68.6, 71.4, 70.5, 70.5, 70.2, 69.6, 67.6, 67.0, 66.0

which is very similar, considering that the standard deviation is 2.2 for Joint.
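The outlier-free averages can be reproduced from the per-seed numbers posted earlier in this thread, e.g. for the first task:

```python
import numpy as np

# First-task accuracies for the 10 seeds posted above (the remaining nine
# columns are omitted for brevity; the same code works on the full matrix).
first_task = np.array([0.806, 0.805, 0.744, 0.842, 0.834,
                       0.767, 0.808, 0.109, 0.858, 0.806])

mean_all = first_task.mean()                       # 0.7379, with seed 7
mean_no_outlier = np.delete(first_task, 7).mean()  # ~0.808, seed 7 removed
```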

arnabphoenix commented 1 year ago

@jmin0530 Could you please tell me how you are changing the num_exemplars argument? It is present in the exemplar.py file, but when I change it there and run main.py, it still reports running with the default number of exemplars. Could you explain how to run main.py with a modified exemplars argument?