harsh-99 / SCL

Implementation of "SCL: Towards Accurate Domain Adaptive Object Detection via Gradient Detach Based Stacked Complementary Losses"
89 stars, 10 forks

Using multiple GPUs to accomplish distributed training #30

Open EddieEduardo opened 1 year ago

EddieEduardo commented 1 year ago

Hello! Thanks for sharing this excellent work!

When I run the code on multiple GPUs with nn.parallel.DistributedDataParallel, it always raises the following error: RuntimeError: one of the variables needed for gradient computation has been modified by an inplace operation: [torch.cuda.FloatTensor [2]] is at version 3; expected version 2 instead. Hint: enable anomaly detection to find the operation that failed to compute its gradient, with torch.autograd.set_detect_anomaly(True).

However, when I run on a single GPU, no error is raised, so I am confused.
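The hint in the error message can be tried on a small, CPU-only reproduction before touching the full training script. The sketch below is not taken from the SCL code: the function `run_step` and the toy tensors are hypothetical, and the in-place `add_` merely stands in for whatever operation modifies a saved tensor in the real run. It only illustrates how this class of error arises and how anomaly detection is switched on; the cause under DistributedDataParallel may well be different.

```python
import torch


def run_step(inplace: bool) -> torch.Tensor:
    """Run one toy forward/backward pass and return the gradient of w.

    Hypothetical helper for illustration only; not part of the SCL repo.
    """
    w = torch.ones(2, requires_grad=True)
    h = w.exp()  # exp() saves its output h for use in the backward pass
    if inplace:
        h.add_(1.0)  # in-place edit: bumps the version counter of the saved tensor
    else:
        h = h + 1.0  # out-of-place: the tensor saved by exp() is left untouched
    h.sum().backward()
    return w.grad


# Anomaly mode records forward-pass stack traces so that the operation whose
# backward failed can be located in the accompanying warning output.
with torch.autograd.detect_anomaly():
    try:
        run_step(inplace=True)
        error_message = None
    except RuntimeError as exc:
        error_message = str(exc)

# The out-of-place variant runs backward without error.
grad = run_step(inplace=False)
```

In a real script, calling `torch.autograd.set_detect_anomaly(True)` once near the top has the same effect as the context manager. Anomaly mode slows training noticeably, so it is usually enabled only while debugging.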

EddieEduardo commented 1 year ago

[Image: SCL run output] Hi, when I run the code, the loss for each part looks like this. Are these values correct? Thanks in advance for your reply.