mmasana / FACIL

Framework for Analysis of Class-Incremental Learning with 12 state-of-the-art methods and 3 baselines.
https://arxiv.org/pdf/2010.15277.pdf
MIT License

Error while trying to train on VGGFace2 #10

Closed afonseca18 closed 2 years ago

afonseca18 commented 2 years ago

Hello! I am trying to use LWF on VGGFace2, but I always get this error:

    line 110, in get_data
        assert data[tt]['ncla'] == cpertask[tt], "something went wrong splitting classes"

Any ideas on how to solve this? Thanks in advance.

mmasana commented 2 years ago

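Hi! This error most probably comes from src/datasets/base_dataset.py because the dataset is not in the correct format. A few possible causes come to mind, but I would need more info:

  • That assert is in line 106 instead of line 110, so have you made any changes to the original file? If so, which ones? The error might come from there.
  • Did you follow the steps from here to add the VGGFace2 dataset?
  • Are label numbers from your .txt dataset files starting at 1 instead of at 0?
  • Right before the assert which causes the error, what is contained in data[0]['ncla'] and cpertask[0] ?

Let me know and hopefully we can figure it out.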

afonseca18 commented 2 years ago

Hi! Thanks for your response.

The difference in the line number comes only from some print statements that I added for debugging.

I think I did it correctly, but your message is now making me doubt it. I added an entry to "dataset_config.py" just like the one below, because the instructions say "If the dataset is a subset or modification of an already added dataset, only use step 1.". Did I misunderstand it?

    'vggface2': {
        'path': join(_BASE_DATA_PATH, 'VGGFace2/data'),
        'resize': 256,
        'crop': 224,
        'flip': True,
        'normalize': ((0.5199, 0.4116, 0.3610), (0.2604, 0.2297, 0.2169))
    }

The label numbers are indeed correct.

maawais commented 2 years ago

Hi! this error comes most probably from src/datasets/base_dataset.py due to the dataset not being in the correct format. A few options to why it happens come to mind, but I would need more info:

  • That assert is in line 106 instead of line 110, so have you made any changes to the original file? If so, which ones? The error might come from there.
  • Did you follow the steps from here to add the VGGFace2 dataset?
  • Are label numbers from your .txt dataset files starting at 1 instead of at 0?
  • Right before the assert which causes the error, what is contained in data[0]['ncla'] and cpertask[0] ?

Let me know and hopefully we can figure it out

I am having exactly the same problem. Here are the debugging values of cpertask[tt] and data[tt]['ncla']:

2 2
2 2
2 2
2 2
2 1

One class is missing in the last check. I am also going through the code myself to find out why it is missing. If something shows up or I find the mistake, I will post an update.

Update

Problem: One class goes missing when the data is split across the tasks.

Reason: class_order is the cause. It numbers the classes starting from 0, while the labels in my files start at 1.

Solution: The missing class problem can be solved in either of two ways:

  1. Start the data label from 0 (not 1) in train.txt and test.txt.
  2. Change the code in base_dataset.py as follows:
    trn_lines = np.loadtxt(os.path.join(path, 'train.txt'), dtype=str)
    tst_lines = np.loadtxt(os.path.join(path, 'test.txt'), dtype=str)
    if class_order is None:
        num_classes = len(np.unique(trn_lines[:, 1]))
        # 1st change: labels run from 1 to num_classes instead of 0 to num_classes - 1
        class_order = list(range(1, num_classes + 1))
    else:
        num_classes = len(class_order)
        # 2nd change: shift every entry by one (works for lists, where "+ 1" would raise a TypeError)
        class_order = [c + 1 for c in class_order]
    # print(class_order)

I am able to run the code with the new dataset after making this change.
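For the first option, a quick check of the label column can catch the problem before training. The following is only a minimal sketch, not part of FACIL: the helper names `read_labels` and `check_labels` are hypothetical, and it assumes each line of train.txt and test.txt holds an image path followed by an integer label, as the second-column indexing in the snippet above suggests.

```python
def read_labels(txt_path):
    """Read the last whitespace-separated field of each non-empty line as an int label.

    Assumes the hypothetical '<image path> <label>' line format.
    """
    with open(txt_path) as f:
        return [int(line.rsplit(maxsplit=1)[1]) for line in f if line.strip()]


def check_labels(labels):
    """Return a list of human-readable problems found in a list of integer labels."""
    unique = sorted(set(labels))
    if not unique:
        return ["no labels found"]
    problems = []
    if unique[0] != 0:
        problems.append(f"labels start at {unique[0]}, expected 0")
    # Any integer between the smallest and largest label that never occurs is a gap.
    missing = sorted(set(range(unique[0], unique[-1] + 1)) - set(unique))
    if missing:
        problems.append(f"labels are not sequential, missing: {missing}")
    return problems
```

Calling `check_labels(read_labels('train.txt'))` returns an empty list when the labels start at 0 and have no gaps. Running it on both files is worthwhile, since a problem in either one could affect the class split.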

mmasana commented 2 years ago

Hi both,

What @maawais mentions is correct. We always use labels starting at 0 instead of 1 for all datasets, and we recommend solution 1. Solution 2 also solves the problem, but it might introduce further issues with other datasets or when using the class_order option in dataset_config.py.

For @afonseca18, you mention:

I think I did it correctly, but your message is now making me think about it. I added an entry on dataset_config.py just like below, because it is said that "If the dataset is a subset or modification of an already added dataset, only use step 1.". Did I misunderstand it?

There might be a small misunderstanding here. VGGFace2 has not been added yet. The entry in the dataset_config.py file is there for convenience (step 1), but the dataset also needs the train.txt and test.txt files in order to work (step 2, second bullet point). Did you also generate those? You need to create them with the path where the data is on your machine.
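As a rough illustration of generating the two files, here is a minimal sketch. It is not part of FACIL and rests on several assumptions: the data is laid out as `<root>/train/<identity>/<image>` and `<root>/test/<identity>/<image>`, each output line has the form `<image path> <label>`, and the function names `list_images` and `write_split_files` are hypothetical. Whether the paths should be absolute or relative to the configured dataset path is not settled in this thread, so adjust that to your setup.

```python
import os


def list_images(split_dir, class_names):
    """Build '<image path> <label>' lines, using each class's index as its label."""
    lines = []
    for label, name in enumerate(class_names):
        class_dir = os.path.join(split_dir, name)
        if not os.path.isdir(class_dir):
            continue  # this class has no folder in this split
        for fname in sorted(os.listdir(class_dir)):
            lines.append(f"{os.path.join(class_dir, fname)} {label}")
    return lines


def write_split_files(root, out_dir):
    """Write train.txt and test.txt into out_dir and return the sorted class names.

    Labels are taken from the sorted training folders, so they start at 0 and
    are sequential, and both files share the same class-to-label mapping.
    """
    train_dir = os.path.join(root, "train")
    class_names = sorted(
        d for d in os.listdir(train_dir)
        if os.path.isdir(os.path.join(train_dir, d))
    )
    for split in ("train", "test"):
        lines = list_images(os.path.join(root, split), class_names)
        with open(os.path.join(out_dir, f"{split}.txt"), "w") as f:
            f.write("\n".join(lines) + "\n")
    return class_names
```

Deriving the labels from the sorted folder names, rather than from the identity numbers themselves, keeps them 0-based and free of gaps even when some identities are left out.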

afonseca18 commented 2 years ago

Hello, @mmasana! Thanks for your reply. In my case, the problem was the labels: they started at 0, but they were not sequential, which was causing the assertion error. I generated both .txt files, and now it is working fine.

Thanks for your replies and your great work!
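For anyone hitting the same non-sequential label problem, one way to repair the labels is to remap them to 0, 1, 2, ... in sorted order. This is only a minimal sketch and not the fix used in this thread: `remap_to_contiguous` is a hypothetical helper that works on (path, label) pairs already parsed from the .txt files.

```python
def remap_to_contiguous(lines):
    """Map arbitrary integer labels to 0..N-1, preserving their sorted order.

    lines: list of (path, label) pairs.
    Returns the relabelled pairs and the old-to-new label mapping.
    """
    old_labels = sorted({label for _, label in lines})
    mapping = {old: new for new, old in enumerate(old_labels)}
    return [(path, mapping[label]) for path, label in lines], mapping
```

The mapping should be built once from the training and test entries together and then applied to both files, so that a class keeps the same label in each.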