Open psu1 opened 6 years ago
Yes, you need to rewrite the loss function outside the model. `DataParallel` replicates your model across multiple GPUs, so you cannot access a member variable of it.
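A minimal sketch of the recommended pattern (the model name and layers are hypothetical, not from this repo): instead of storing the loss on `self`, where it is lost when `DataParallel` replicates the model, return it from `forward()` so `DataParallel` gathers it back to the main device.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x, target):
        out = self.fc(x)
        # Return the loss instead of assigning self.loss = ...:
        # attributes set inside forward() live on the replicas,
        # not on the DataParallel wrapper.
        return nn.functional.mse_loss(out, target)

net = nn.DataParallel(TinyNet())
x = torch.randn(8, 4)
target = torch.randn(8, 2)
loss = net(x, target)  # works with or without multiple GPUs
```

Note that with several GPUs the gathered result is a vector of per-device losses, so call `loss.mean()` before `loss.backward()`. Attributes of the original model remain reachable via `net.module` (e.g. `net.module.fc`).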
Hi, I am trying to run the code on multiple GPUs and I have rewritten the loss function outside the model. Training looks normal, but when I test, I get a lot of negative APs. Do you have any idea why? Thanks!
When running with `torch.nn.DataParallel(net).cuda()`, I get `AttributeError: 'DataParallel' object has no attribute 'loss'`.
After I change `loss = net.loss` to `loss = net.module.loss`, I get `TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'` at `return self.bbox_loss + self.iou_loss + self.cls_loss`.
Do I need to rewrite the loss function outside `class Darknet19(nn.Module)`?
Any better ideas?
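One way to rewrite the loss outside the model (a sketch under assumptions, not this repo's actual code; `Darknet19Stub` and the single `criterion` stand in for the real network and its bbox/iou/cls losses): have `forward()` return only the predictions and compute the loss on the gathered output, so nothing is stored on the replicas and no `.module` access is needed.

```python
import torch
import torch.nn as nn

class Darknet19Stub(nn.Module):
    """Hypothetical stand-in for the real Darknet19."""
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(4, 2)

    def forward(self, x):
        return self.fc(x)  # predictions only, no self.loss side effect

net = nn.DataParallel(Darknet19Stub())
criterion = nn.MSELoss()  # placeholder for the bbox/iou/cls losses

x = torch.randn(8, 4)
target = torch.randn(8, 2)
pred = net(x)                   # DataParallel gathers outputs onto one device
loss = criterion(pred, target)  # loss computed outside the model
loss.backward()
```

With this structure the `NoneType` error cannot occur, because the loss is never read back from a replicated module's attributes.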