Open rishabhfrinks123 opened 1 year ago
To compute the loss, you should set the label values to 0, not NaN. Since you set the values to NaN, the `loss.backward()` error occurred.
No, I did not assign any labels to NaN. As soon as the input goes through `initial_conv`, the output comes out as NaN, and I can't figure out why. Secondly, if I manually replace the NaN outputs with 0, move them back to CUDA, and then compute the loss, that would not be correct for the optimizer, which works on the loss to reduce it. What I mean is: if I manually set the outputs to zeros, the loss would be 0 for the negative images from the very first epoch, which may not be good for learning. Please tell me whether my thinking is right.
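One possible explanation for NaNs appearing already at `initial_conv` (an assumption here, since the training code isn't shown): if an earlier batch produced a NaN loss, the resulting NaN gradients poison the weights during the optimizer step, and every forward pass afterwards outputs NaN, regardless of the input. A minimal NumPy sketch of the effect:

```python
import numpy as np

# Healthy weights before the bad batch.
w = np.ones(3)

# A single NaN in the gradient (e.g. from a NaN loss) is enough.
grad = np.array([0.1, np.nan, 0.2])
w -= 0.01 * grad  # SGD-style update: w[1] becomes NaN

# Every subsequent forward pass is now contaminated.
x = np.array([1.0, 2.0, 3.0])
print(w @ x)  # nan
```

If this is what is happening, the NaN outputs are a symptom, not the cause; the first NaN loss is what needs to be traced (`torch.autograd.set_detect_anomaly(True)` can help locate it).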
To be specific: the masks are labelled as 0 only for the images that do not contain any object.
I added some images that have no label in them, i.e. completely black masks for those images, and the model's output for them is tensors full of NaNs instead of 0s:
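A common source of NaNs with all-black masks (an assumption here, since the loss function isn't shown) is an overlap-based loss such as Dice: with an empty target and an empty prediction, both sums are zero and the ratio becomes 0/0. A smoothing term in the denominator fixes this; a sketch in plain NumPy:

```python
import numpy as np

def dice_loss(pred, target, smooth=0.0):
    # Dice loss: 1 - 2*|P∩T| / (|P|+|T|); `smooth` guards the 0/0 case.
    inter = (pred * target).sum()
    denom = pred.sum() + target.sum()
    return 1.0 - (2.0 * inter + smooth) / (denom + smooth)

pred = np.zeros((4, 4))    # model predicts nothing
target = np.zeros((4, 4))  # negative image: completely black mask

print(dice_loss(pred, target))              # nan (0/0 on the empty mask)
print(dice_loss(pred, target, smooth=1.0))  # 0.0, the correct loss for a true negative
```

With smoothing, a correct all-zero prediction on a negative image yields loss 0 naturally, so there is no need to overwrite any outputs by hand.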
outputs:

```
tensor([[[[nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          ...,
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan],
          [nan, nan, nan, ..., nan, nan, nan]]]], device='cuda:0')
```
because of which `loss.backward()` raises an error:

```
Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: CUDA error: device-side assert triggered
```
My idea was to put some negative images into training so that the model learns the busy background more clearly; when we remove these negative images and their corresponding masks, the code works fine.
Please confirm how to resolve this so that I can keep those negative images in training.
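Until the root cause is found, one pragmatic guard (a sketch, not your actual training loop) is to skip the backward/optimizer step whenever the loss is not finite, so a single bad batch cannot poison the weights:

```python
import math

# Hypothetical per-batch loss values; NaN stands in for a bad negative-image batch.
batch_losses = [0.42, float("nan"), 0.37]

updated = []
for loss in batch_losses:
    if not math.isfinite(loss):
        continue              # skip loss.backward() / optimizer.step() for this batch
    updated.append(loss)      # stand-in for the actual weight update

print(updated)  # [0.42, 0.37]
```

This keeps training alive with the negative images included, but it only masks the problem; the loss itself should still be made NaN-safe for empty masks.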