Training gives nan - Githubissues

AlexeyAB / darknet

YOLOv4 / Scaled-YOLOv4 / YOLO - Neural Networks for Object Detection (Windows and Linux version of Darknet)

http://pjreddie.com/darknet/

Training gives nan #825

Open chinmay5 opened 6 years ago

chinmay5 commented 6 years ago

Hi, I am trying to train a YOLO model on the German Traffic Dataset. However, while training I keep getting NaN values. I used the converter provided in this link to format the annotations. This is a sample annotation that I have for one of the files

As you can see, it matches the format. I tried changing the learning rate as well, but nothing seems to help. Can someone please help me out here?

AlexeyAB commented 6 years ago

Width and height can't be negative, and the absolute value of height can't be more than 1.

Use this tool to check your dataset: https://github.com/AlexeyAB/Yolo_mark

Read: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects
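
The constraints above can be checked mechanically. The following is a minimal Python sketch, not part of Darknet or Yolo_mark; the function name `check_label_line` is hypothetical. It assumes each label line has the form `<class> <x_center> <y_center> <width> <height>`, and it additionally assumes that the center coordinates must also lie in [0, 1], which the comment above does not state explicitly.

```python
def check_label_line(line):
    """Return a list of problems found in one YOLO-style label line.

    Assumed layout: <class> <x_center> <y_center> <width> <height>,
    with the four box values relative to the image size.
    """
    fields = line.split()
    if len(fields) != 5:
        return [f"expected 5 fields, got {len(fields)}"]
    try:
        class_id = int(fields[0])
        x, y, w, h = (float(v) for v in fields[1:])
    except ValueError:
        return ["non-numeric field"]

    problems = []
    if class_id < 0:
        problems.append("class id is negative")
    # Assumption: centers must fall inside the image.
    for name, value in (("x_center", x), ("y_center", y)):
        if not 0.0 <= value <= 1.0:
            problems.append(f"{name}={value} is outside [0, 1]")
    # Width and height must be positive and at most 1.
    for name, value in (("width", w), ("height", h)):
        if not 0.0 < value <= 1.0:
            problems.append(f"{name}={value} is outside (0, 1]")
    return problems


if __name__ == "__main__":
    print(check_label_line("1 0.057813 0.631944 0.101563 0.075000"))
    print(check_label_line("3 0.5 0.5 -0.1 1.4"))
```

Running it over every line of every label file before training is a cheap way to catch negative or oversized boxes early.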

chinmay5 commented 6 years ago

My only issue is that the dataset already comes with annotations, and I used the converter from the link given above. So, should I basically ignore the annotations and start annotating from scratch? Or are there better scripts available for the conversion? Any help would be highly appreciated.

AlexeyAB commented 6 years ago

So, should I basically ignore the annotations and start annotating from scratch? Or are there better scripts available for the conversion?

Yes, these annotations are incorrect. Try writing your own script to convert these annotations. I don't know of any scripts for the German Traffic Dataset. If you find or create one, please share it here.
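
As a rough illustration of what such a conversion script might look like, here is a minimal Python sketch. It is not the script used in this thread. It assumes (this is an assumption, not something confirmed here) that each source annotation row has the form `filename;left;top;right;bottom;class_id` with pixel coordinates, and that the image width and height are known. The names `to_yolo_line` and `convert_row` are hypothetical.

```python
def to_yolo_line(class_id, left, top, right, bottom, img_w, img_h):
    """Convert a pixel-corner box to '<class> <x_center> <y_center> <width> <height>'.

    All four output box values are divided by the image size, so they are
    relative values rather than pixels.
    """
    x_center = (left + right) / 2.0 / img_w
    y_center = (top + bottom) / 2.0 / img_h
    # abs() guards against swapped corners, which would otherwise
    # produce a negative width or height.
    width = abs(right - left) / img_w
    height = abs(bottom - top) / img_h
    return f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


def convert_row(row, img_w, img_h):
    """Parse one assumed 'filename;left;top;right;bottom;class_id' row."""
    name, left, top, right, bottom, class_id = row.strip().split(";")
    line = to_yolo_line(int(class_id), float(left), float(top),
                        float(right), float(bottom), img_w, img_h)
    return name, line


if __name__ == "__main__":
    # Made-up row and image size, for illustration only.
    print(convert_row("00000.ppm;100;200;300;400;7", 1000, 800))
```

Dividing by the real image dimensions matters here: normalizing by the wrong size is one way to end up with values greater than 1.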

dexception commented 6 years ago

@chinmay5

Kindly check your dataset using the Yolo_mark application.

chinmay5 commented 6 years ago

Okay, I corrected the file and there are no more annotations with a negative value.

However, I still get the same NaN values. Should something be done with the learning rate? Any help is highly appreciated, since I am completely stuck here.

AlexeyAB commented 6 years ago

@chinmay5 Did you get NaN in the avg loss field? If not, then training is going well.

chinmay5 commented 6 years ago

This is what I get :(

AlexeyAB commented 6 years ago

One more thing: https://github.com/AlexeyAB/darknet#how-to-train-to-detect-your-custom-objects

You should have 5 values for each line, something like this:

1 0.057813 0.631944 0.101563 0.075000
1 0.165234 0.556250 0.119531 0.070833
1 0.276563 0.506944 0.106250 0.069444
1 0.369922 0.426389 0.097656 0.094444

not 6 as here:

chinmay5 commented 6 years ago

Hi. Thank you for the response. The digit '1' that you see here is actually from my text editor and the actual file is having 5 fields only. First one is the class label (I have 42 in total) while the remaining 4 are the bounding box coordinates

AlexeyAB commented 6 years ago

@chinmay5 Then everything is fine. But your training log looks as if the training dataset has no objects. Can you share your dataset labeled for Yolo, using Google Drive or something similar?
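
One way to test the "no objects" symptom locally is to confirm that every training image has a non-empty label file. The sketch below is a hypothetical helper, not part of Darknet. It assumes that each label file sits next to its image and shares its base name with a `.txt` extension, and it only reports which images have a missing or empty label file.

```python
import os


def label_path_for(image_path):
    """Assumed convention: same directory and base name, '.txt' extension."""
    root, _ext = os.path.splitext(image_path)
    return root + ".txt"


def find_unlabeled(image_paths):
    """Return (missing, empty): images whose label file is absent or blank."""
    missing, empty = [], []
    for image_path in image_paths:
        label_path = label_path_for(image_path)
        if not os.path.isfile(label_path):
            missing.append(image_path)
            continue
        with open(label_path) as f:
            # A label file with only whitespace contains no objects.
            if not any(line.strip() for line in f):
                empty.append(image_path)
    return missing, empty
```

The list of image paths would typically come from the training list file, one path per line. If both returned lists are empty but the log still suggests no objects, the labels are probably fine and the problem lies in how the files are being read.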

chinmay5 commented 6 years ago

Sure @AlexeyAB . I am attaching the zip file here with the data. The only thing I would like to mention is that the data is in ppm format.

https://drive.google.com/open?id=1xP1W52fAq1-hAQrFoWCvVsKSQVHn3IFn

Would really appreciate it if you could give some insight.

AlexeyAB commented 6 years ago

@chinmay5 I added a fix for ppm files. Try updating Darknet from this GitHub repository, recompile, and train again. I also checked your dataset using Yolo_mark, and it looks correct.

chinmay5 commented 6 years ago

Hi @AlexeyAB, I performed the mentioned steps and I can see logs which look a bit better.

However, in a test run, when I tried a prediction on a test image (which was in jpg format), I did not see any bounding boxes. Is there something different that needs to be done here now?

AlexeyAB commented 6 years ago

class label (I have 42 in total)

You should train about 42 x ~2000 = ~84 000 iterations

How many iterations did you train? What avg-loss do you see? And what mAP can you get?

Also, what can you get using this command?

./darknet detector calc_anchors data/obj.data -num_of_clusters 9 -width 416 -height 416
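
The iteration estimate in this comment is simple arithmetic: roughly 2000 iterations per class. A tiny Python sketch of it follows; the function name `suggested_iterations` is hypothetical, and the 2000-per-class figure is the rule of thumb quoted above, not a hard limit.

```python
def suggested_iterations(num_classes, per_class=2000):
    """Rule-of-thumb iteration count: about `per_class` iterations per class."""
    return num_classes * per_class


if __name__ == "__main__":
    # 42 classes, as in this thread.
    print(suggested_iterations(42))
```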

chinmay5 commented 6 years ago

@AlexeyAB It has finally started working, and after only around 4k iterations I already see some decent results. I will keep you posted about the improvements when I reach around 50K iterations. I would say it looks good at this point.

AlexeyAB commented 6 years ago

@chinmay5 Can you share the script that you used to get Yolo-labels for German Traffic Dataset?

chinmay5 commented 6 years ago

@AlexeyAB It was actually part of the dataset (not the script, but the annotations). But in case you need it, I can send the two "very simple" scripts I used for parsing the data and converting the values.

AlexeyAB commented 6 years ago

@chinmay5 Yes, can you compress it and drag and drop it into your message?

chinmay5 commented 6 years ago

@AlexeyAB Here it is, though I am not sure whether the code will turn out to be very useful :)

annotation-converter.zip