Loss is always "inf" when running training (#1)
Closed: nmd95 closed 3 years ago
Hi @nmd95, thank you for bringing this to my attention. There may be a bug in the loss function in this version of the code. I will take a look and get back to you. In the meantime, one thing you can try is reducing the batch size for training.
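For reference, if train.py builds a standard PyTorch DataLoader, the workaround would look something like the sketch below; the dataset construction and the original batch size are assumptions, not taken from the repository.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# 'dataset' is a stand-in for whatever Dataset train.py actually constructs.
dataset = TensorDataset(torch.randn(100, 3), torch.randn(100, 1))

# Workaround: drop batch_size down to 1 (the default used by train.py is
# not shown in this thread and is an assumption here).
train_loader = DataLoader(dataset, batch_size=1, shuffle=True)
```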
This seems to be resolved only by reducing the batch size to 1.
Ok. I will take a look and get back to you on this problem.
Caused by log(0). Fixed by adding an epsilon to the log in the unsupervised loss function:

loss = (loss_offset * pred_scores) - (weight * torch.log(pred_scores + eps))
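To illustrate why this fixes the "inf" loss, here is a minimal sketch: when pred_scores contains an exact 0 (e.g. a saturated softmax output), torch.log returns -inf and the whole loss becomes inf, while the epsilon keeps it finite. The weight and eps values here are illustrative, not taken from the repository.

```python
import torch

def unsupervised_loss(loss_offset, pred_scores, weight=0.1, eps=1e-6):
    # torch.log(0) = -inf, which propagates to an inf total loss.
    # Adding eps inside the log keeps the term finite.
    return ((loss_offset * pred_scores)
            - (weight * torch.log(pred_scores + eps))).mean()

# A zero score blows up the un-patched loss; the patched one stays finite.
scores = torch.tensor([0.0, 0.3, 0.7])
offsets = torch.ones(3)
print(unsupervised_loss(offsets, scores))                   # finite
print((offsets * scores - 0.1 * torch.log(scores)).mean())  # inf
```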
Hi, I tried to run train.py after cloning the repo and downloading the dataset. As can be seen in the attached screenshot, the loss remains "inf" even after approximately 200 epochs of training.
Is there something that can be done about this? What is going on here?
Any help would be much appreciated!
Cheers,