sgrvinod / a-PyTorch-Tutorial-to-Object-Detection

SSD: Single Shot MultiBox Detector | a PyTorch Tutorial to Object Detection
MIT License

Loss function value is nan #53

Closed. anesh-ml closed this issue 4 years ago

anesh-ml commented 4 years ago

The MultiBox loss starts out above 2000 and then becomes nan. Why is that? @sgrvinod

toomy0toons commented 4 years ago

Try reducing the batch size and enabling gradient clipping, as sgrvinod noted.

I'm using a batch size of 8 and a gradient clipping value of 1.
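For readers unfamiliar with gradient clipping, here is a minimal sketch in plain Python, assuming "clipping value 1" means clamping each gradient element to [-1, 1]. The helper name `clip_by_value` is hypothetical and not taken from the tutorial; in PyTorch, `torch.nn.utils.clip_grad_value_` applies the same idea to parameter gradients.

```python
def clip_by_value(grads, clip):
    """Clamp every gradient element to the range [-clip, clip].

    Hypothetical helper for illustration only; it works on a plain list
    of floats rather than on real model parameters.
    """
    return [max(-clip, min(clip, g)) for g in grads]


# One exploding gradient (1500.0) no longer dominates the update.
grads = [0.3, -4.2, 1500.0, -0.9]
print(clip_by_value(grads, clip=1.0))  # [0.3, -1.0, 1.0, -0.9]
```

Clipping limits the size of each update, but it cannot repair a loss that is already nan because of bad inputs.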

anesh-ml commented 4 years ago

@toomy0toons Thanks for the response. I found that the log in the cxcy_to_gcxgcy function returns nan whenever its input is negative. I had to take the absolute value to get proper values. How come you did not get any nan values?
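A rough sketch of how a negative value can reach that log, assuming the size terms are encoded as a scaled log of the box-to-prior size ratio. The function names and the scale factor of 5 are assumptions for illustration, not a copy of the tutorial's code. Plain `math.log` raises an error on negative input, so `tensor_like_log` imitates the tensor behaviour of returning nan.

```python
import math


def tensor_like_log(x):
    """Imitate a tensor log: nan for negative input, -inf for zero."""
    if x > 0:
        return math.log(x)
    return float("-inf") if x == 0 else float("nan")


def encode_size(box, prior_w, prior_h, scale=5.0):
    """Encode width and height of an (xmin, ymin, xmax, ymax) box.

    `scale` is an assumed constant; the point is only the log of the ratio.
    """
    xmin, ymin, xmax, ymax = box
    w, h = xmax - xmin, ymax - ymin  # negative if the coordinates are swapped
    return (tensor_like_log(w / prior_w) * scale,
            tensor_like_log(h / prior_h) * scale)


good = encode_size((0.2, 0.3, 0.6, 0.7), prior_w=0.4, prior_h=0.4)
swapped = encode_size((0.6, 0.3, 0.2, 0.7), prior_w=0.4, prior_h=0.4)
print(good)     # both values are finite
print(swapped)  # first value is nan because the width is negative
```

Taking the absolute value hides the symptom, but the box center and the matching against priors would still be computed from a malformed box.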

toomy0toons commented 4 years ago

@Eloneinstein The center-size coordinates should be positive, since the width and height are measured as ratios, so the value passed to the log should always be positive. Are you using a custom dataset? If so, make sure the boxes are in xmin, ymin, xmax, ymax order. The same error happened to me when I fed in a different dataset whose boxes were ordered xmax, ymin, xmin, ymax.
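A quick way to catch this before training is to scan the parsed annotations for boxes that break the expected order. This is a minimal sketch, assuming boxes are stored as (xmin, ymin, xmax, ymax) tuples; `find_misordered_boxes` is a hypothetical helper, not part of the tutorial.

```python
def find_misordered_boxes(boxes):
    """Return the indices of boxes that violate xmin < xmax and ymin < ymax."""
    bad = []
    for i, (xmin, ymin, xmax, ymax) in enumerate(boxes):
        if not (xmin < xmax and ymin < ymax):
            bad.append(i)
    return bad


boxes = [
    (10, 20, 110, 220),  # correct order
    (110, 20, 10, 220),  # xmax and xmin swapped
]
print(find_misordered_boxes(boxes))  # [1]
```

An empty result does not prove the annotations are right, but a non-empty one points directly at the boxes that would give a negative width or height.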

anesh-ml commented 4 years ago

@toomy0toons I am using the same VOC 2007 dataset. I will check the order of the coordinates. What loss values were you getting?

anesh-ml commented 4 years ago

@toomy0toons Thanks a lot for the solution. I had put the coordinates in the wrong order. Thanks again.