shivamsaboo17 / Overcoming-Catastrophic-forgetting-in-Neural-Networks

Elastic weight consolidation technique for incremental learning.

Using for datasets with different number of classes #8

Closed appledora closed 5 months ago

appledora commented 5 months ago

Hello, I was following the demo.ipynb notebook. In my case, I replaced the backbone with a ResNet, and instead of MNIST and F-MNIST I am trying to use the Stanford Cars dataset (201 classes) and the Oxford Birds dataset (196 classes).

  1. I first train the model on the Cars dataset as follows:

    ewc_model = ElasticWeightConsolidation(resnet34(car_classes), criterion)
    # training loop
    ewc_model.register_ewc_params(car_train_dataset, 4, 10)
  2. Then I replaced the final linear layer of the ewc_model with the appropriate number of classes for the Birds dataset:

    ewc_model.model.fc = nn.Linear(512, birds_classes)

However, when I tried to start the training loop for Birds, I got the following error:

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Cell In[7], line 6
      4 for epoch in range(num_epochs):
      5     for i, (input, target) in enumerate(tqdm.tqdm(birds_train_loader)):
----> 6         ewc_model.forward_backward_update(input, target)
      7     print(f"Epoch {epoch+1} completed.")
      9 ewc_model.register_ewc_params(birds_train_dataset, 4, 10)

Cell In[1], line 61, in ElasticWeightConsolidation.forward_backward_update(self, input, target)
     59 def forward_backward_update(self, input, target):
     60     output = self.model(input)
---> 61     consolidation_loss = self._compute_consolidation_loss(self.weight)
     62     criterion_loss = self.crit(output, target)
     63     loss = criterion_loss + consolidation_loss

Cell In[1], line 54, in ElasticWeightConsolidation._compute_consolidation_loss(self, weight)
     52         estimated_mean = getattr(self.model, '{}_estimated_mean'.format(_buff_param_name))
     53         estimated_fisher = getattr(self.model, '{}_estimated_fisher'.format(_buff_param_name))
---> 54         losses.append((estimated_fisher * (param - estimated_mean) ** 2).sum())
     55     return (weight / 2) * sum(losses)
     56 except AttributeError:

RuntimeError: The size of tensor a (201) must match the size of tensor b (196) at non-singleton dimension 0
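
I suspect the problem is that register_ewc_params stored the estimated mean/fisher buffers for the old 201-way fc layer, and _compute_consolidation_loss now compares those buffers against the parameters of the new 196-way layer, hence the size mismatch. A quick way to confirm would be to print the buffer shapes against the current head (assuming the buffers are registered on ewc_model.model via register_buffer, with the parameter names' dots replaced by underscores, as the traceback suggests):

    # Compare the stored EWC buffers with the current fc layer
    # (buffer naming is inferred from the traceback, so this is only a sketch)
    for name, buf in ewc_model.model.named_buffers():
        if name.endswith('_estimated_mean') or name.endswith('_estimated_fisher'):
            print(name, tuple(buf.shape))
    print('current fc.weight:', tuple(ewc_model.model.fc.weight.shape))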

Any ideas how I might be able to fix this?
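
For now, the workaround I am considering is to drop the buffers that belong to the old head before training on Birds, so that the consolidation penalty only constrains the shared backbone. Roughly like this (the fc_weight_* / fc_bias_* buffer names are my guess based on the traceback, so treat this as a sketch):

    import torch.nn as nn

    # Replace the head for the new task
    ewc_model.model.fc = nn.Linear(512, birds_classes)

    # Remove the stored mean/fisher buffers of the old 201-way head so the
    # EWC penalty no longer compares tensors of mismatched shape
    # (buffer names are guessed from the traceback)
    for suffix in ('weight', 'bias'):
        for kind in ('estimated_mean', 'estimated_fisher'):
            buf_name = 'fc_{}_{}'.format(suffix, kind)
            if hasattr(ewc_model.model, buf_name):
                delattr(ewc_model.model, buf_name)

I am not sure, though, whether deleting the buffers would just hit the except AttributeError branch in _compute_consolidation_loss and silently skip the penalty for the backbone as well.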