Open sparshgarg23 opened 7 months ago
A bit of an update: I tried the changes mentioned in https://github.com/thisisi3/Paddle-YOLOF/issues/1#issuecomment-1115545926. However, that doesn't fix the NaN problem.
Please check that YOLOF is working correctly. As mentioned in earlier comments, the evaluation results are still 0 even after decreasing the learning rate.
Even though the model's loss is decreasing, when I evaluate the model on test images no results or bounding boxes are drawn. Instead, I get an error saying that evaluation couldn't be done because the entire test directory is empty.
Same here. I thought it was a problem with my dataset, but it's not. Tuning the parameters is difficult.
I am training the YOLOF model, and I noticed the README mentions some instabilities in the current model that result in NaN and loss fluctuation issues.
So I wanted to know why the NaN issue occurs here. I had earlier trained SOLOv2 and FCOS and didn't notice any NaN issues with those models.
Is it because of one of the following?
1. The parameters chosen for model training cause the gradients to explode.
2. Recent MMEngine/MMCV updates.
3. The NaNs come from how the model is implemented and should be expected with all vision transformer models.
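For the first possibility (exploding gradients), the two usual mitigations are clipping the global gradient norm and scaling the learning rate to the batch size actually used. Below is a minimal pure-Python sketch of both ideas, not the MMDetection implementation. The function names and the numbers in the usage note are hypothetical and only illustrate the arithmetic; in MMEngine-based configs, clipping is normally enabled through the `clip_grad` entry of `optim_wrapper` rather than written by hand.

```python
import math


def global_grad_norm(grads):
    """L2 norm over all gradient values (here a flat list of floats)."""
    return math.sqrt(sum(g * g for g in grads))


def clip_by_global_norm(grads, max_norm):
    """Rescale gradients so their global L2 norm is at most max_norm.

    Raises FloatingPointError if the norm is NaN or inf, so a diverging
    run fails at the first bad step instead of silently producing NaN weights.
    """
    norm = global_grad_norm(grads)
    if not math.isfinite(norm):
        raise FloatingPointError(f"non-finite gradient norm: {norm}")
    if norm <= max_norm:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]


def linearly_scaled_lr(base_lr, base_batch_size, actual_batch_size):
    """Linear scaling rule: learning rate proportional to total batch size."""
    return base_lr * actual_batch_size / base_batch_size
```

For example, if a config were tuned with a learning rate of 0.12 for a total batch size of 64 (hypothetical values, check your own config), training with a total batch size of 8 would suggest `linearly_scaled_lr(0.12, 64, 8)`, roughly 0.015. Running with the unscaled rate on a much smaller batch is a common source of early divergence.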