catalyst-team / catalyst

Accelerated deep learning R&D
https://catalyst-team.com
Apache License 2.0

Multi Criterion Training #1435

Closed GirinChutia closed 1 year ago

GirinChutia commented 2 years ago

Error in Multi Criterion Training


import torch
from torch import nn

# per-class weights for the first criterion
weights = [0.2, 0.3]
class_weights = torch.FloatTensor(weights).to(device)  # .cuda()
criterion = {
    "CE_Loss1": nn.CrossEntropyLoss(weight=class_weights),
    "CE_Loss2": nn.CrossEntropyLoss(),
}
....
....
# inside handle_batch: one loss per head, plus their sum
loss1 = self.criterion["CE_Loss1"](self.batch["logits1"], self.batch["targets1"])
loss2 = self.criterion["CE_Loss2"](self.batch["logits2"], self.batch["targets2"])
loss_ce1ce2 = loss1 + loss2
self.batch_metrics.update({"loss_ce1": loss1,
                           "loss_ce2": loss2,
                           "loss_ce1ce2": loss_ce1ce2})

for key in ["loss_ce1", "loss_ce2", "loss_ce1ce2"]:
    self.meters[key].update(self.batch_metrics[key].item(), self.batch_size)

if self.is_train_loader:
    self.engine.backward(loss_ce1ce2)  # this is the line that raises the error
    self.optimizer.step()
    self.optimizer.zero_grad()

Hi, I am trying to train a model with multiple criteria. The relevant part of the loss-computation code is shown above. When I run it, I get the following error:


RuntimeError: Trying to backward through the graph a second time (or directly access saved tensors after they have already been freed). Saved intermediate values of the graph are freed when you call .backward() or autograd.grad(). Specify retain_graph=True if you need to backward through the graph a second time or if you need to access saved tensors after calling backward.


Can anyone please check whether I am doing this the correct way?
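For context, here is roughly how the pieces above might fit together in runnable form. This is a minimal sketch: the two-head model call, the batch keys, and the omitted meters bookkeeping are placeholders from my pipeline rather than Catalyst requirements; only the loss/backward logic mirrors the snippet above.

import torch
from torch import nn
from catalyst import dl

class CustomRunner(dl.Runner):
    def handle_batch(self, batch):
        # Hypothetical two-head forward pass; the real model and batch
        # keys come from the surrounding pipeline.
        logits1, logits2 = self.model(batch["features"])
        self.batch = {
            "logits1": logits1, "targets1": batch["targets1"],
            "logits2": logits2, "targets2": batch["targets2"],
        }

        # One loss per head, plus their sum as the training objective.
        loss1 = self.criterion["CE_Loss1"](logits1, self.batch["targets1"])
        loss2 = self.criterion["CE_Loss2"](logits2, self.batch["targets2"])
        loss = loss1 + loss2
        self.batch_metrics.update(
            {"loss_ce1": loss1, "loss_ce2": loss2, "loss_ce1ce2": loss}
        )

        if self.is_train_loader:
            # Exactly one backward pass per batch, on the summed loss.
            self.engine.backward(loss)
            self.optimizer.step()
            self.optimizer.zero_grad()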

github-actions[bot] commented 2 years ago

Hi! Thank you for your contribution! Please re-check all issue template checklists; unfilled issues will be closed automatically. And do not forget to join our slack for collaboration.

GirinChutia commented 2 years ago

I think I have solved it. My mistake, I believe, was that in my runner.train() call I passed callbacks=[dl.BackwardCallback(metric_key="loss_ce1ce2")], which shouldn't be there if I put

if self.is_train_loader:
    self.engine.backward(loss_ce1ce2) 
    self.optimizer.step()
    self.optimizer.zero_grad()

in the handle_batch method of my CustomRunner class.

Is my understanding correct? Please let me know.
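That reading matches the error message: dl.BackwardCallback triggers a second backward pass on the same loss after handle_batch has already called engine.backward and freed the graph. The two wirings are mutually exclusive, roughly as in the sketch below (model, criterion, optimizer, and loaders are placeholders from the surrounding training script, and the OptimizerCallback pairing in option B is an assumption about a typical setup):

from catalyst import dl

# Option A: manual update inside handle_batch (engine.backward + step
# + zero_grad there), so runner.train() gets no backward callback:
runner.train(
    model=model, criterion=criterion, optimizer=optimizer, loaders=loaders,
    num_epochs=1,
)

# Option B: keep handle_batch free of backward/step/zero_grad and
# delegate the whole update to callbacks instead:
runner.train(
    model=model, criterion=criterion, optimizer=optimizer, loaders=loaders,
    num_epochs=1,
    callbacks=[
        dl.BackwardCallback(metric_key="loss_ce1ce2"),
        dl.OptimizerCallback(metric_key="loss_ce1ce2"),
    ],
)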

Pupy101 commented 2 years ago

Hi, here is an example of a callback that aggregates several loss functions. I think it should help you.
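Roughly, that aggregation-based setup might look like the sketch below; the callback names and parameters follow catalyst's dl module, but treat the exact arguments as an assumption to verify against your installed version:

from catalyst import dl

callbacks = [
    # One CriterionCallback per loss; criterion_key picks the entry
    # from the criterion dict passed to runner.train().
    dl.CriterionCallback(
        input_key="logits1", target_key="targets1",
        metric_key="loss_ce1", criterion_key="CE_Loss1",
    ),
    dl.CriterionCallback(
        input_key="logits2", target_key="targets2",
        metric_key="loss_ce2", criterion_key="CE_Loss2",
    ),
    # Sum the per-criterion losses into a single training loss.
    dl.MetricAggregationCallback(
        metric_key="loss_ce1ce2",
        metrics=["loss_ce1", "loss_ce2"],
        mode="sum",
    ),
    dl.BackwardCallback(metric_key="loss_ce1ce2"),
    dl.OptimizerCallback(metric_key="loss_ce1ce2"),
]

With this wiring, handle_batch only needs to put logits and targets into self.batch; the loss computation, aggregation, backward pass, and optimizer step all happen in the callbacks.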

stale[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.