pjreddie / darknet

Convolutional Neural Networks
http://pjreddie.com/darknet/

.jpeg files not found in /labels/ folder #658

Open Jakub-Svoboda opened 6 years ago

Jakub-Svoboda commented 6 years ago

I trained YOLOv3 on the Pascal VOC dataset and the training went fine, so now I am trying to train YOLOv3 on my own custom class. I have a dataset in VOC format, for which I generated the labels with the script provided. All the images are in the /JPEGImages/ folder and all the generated annotation files are in the /labels/ folder. I created a new config file called "yolov3-head.cfg", changed the number of classes in each yolo layer to 1, and changed the number of filters in the layer above each yolo layer to 18. I set batch=64 and subdivisions=16. I also created a new cfg/head.data file, which looks like this:

classes= 1
train  = /home/kuba/yolo/darknet/dataset/train.txt
valid  = /home/kuba/yolo/darknet/dataset/val.txt
names = data/head.names
backup = backup

and also a new data/head.names which contains a single class name: head.
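
The filter count of 18 mentioned above is consistent with the usual YOLOv3 rule of thumb: each yolo layer predicts 3 boxes per cell, and each box carries 4 coordinates, 1 objectness score, and one score per class. A minimal sketch of that arithmetic follows; the function name `yolo_filters` and the default of 3 anchors per scale are illustrative assumptions, not something taken from the darknet source.

```python
def yolo_filters(classes, anchors_per_scale=3):
    """Filters for the conv layer directly above a yolo layer.

    Assumes the common YOLOv3 layout: per anchor, 4 box coordinates,
    1 objectness score, and one score per class.
    """
    return (classes + 5) * anchors_per_scale


# One class ("head") gives the value used in yolov3-head.cfg.
print(yolo_filters(1))
```

If the number of classes changes, the same expression gives the value to put in each `filters=` line above a yolo layer.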

My train.txt file contains full paths to the individual images:

/home/kuba/yolo/darknet/dataset/VOCdevkit/VOC2007/JPEGImages/mov_001_007585.jpeg
/home/kuba/yolo/darknet/dataset/VOCdevkit/VOC2007/JPEGImages/mov_001_007587.jpeg
/home/kuba/yolo/darknet/dataset/VOCdevkit/VOC2007/JPEGImages/mov_001_007589.jpeg
...

The problem arises when I start the training with:

./darknet detector train cfg/head.data cfg/yolov3-head.cfg darknet53.conv.74

For some reason darknet is looking for the .jpeg files in the /labels/ folder and I receive this error:

Couldn't open file: /home/kuba/yolo/darknet/dataset/VOCdevkit/VOC2007/labels/mov_001_130285.jpeg
Couldn't open file: /home/kuba/yolo/darknet/dataset/VOCdevkit/VOC2007/labels/mov_013_117936.jpeg
Couldn't open file: /home/kuba/yolo/darknet/dataset/VOCdevkit/VOC2007/labels/mov_001_156342.jpeg

I have tried copying every .jpeg file from the /JPEGImages/ to the /labels/ folder, but then the training numbers are all -nan:

Loaded: 0.589299 seconds
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.484583, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.493059, .5R: -nan, .75R: -nan,  count: 0

jeremy-kunzhou commented 6 years ago

Hi, the image path is not the same as the one described in your data file (train = /home/kuba/yolo/darknet/dataset/train.txt). Check the paths in your train.txt.

Jakub-Svoboda commented 6 years ago

The paths were set correctly. I actually fixed this error by renaming all the .jpeg files to .jpg.

Trojanking123 commented 6 years ago

@Jakub-Svoboda Did renaming the .jpeg files to .jpg fix it? I have the same problem with the same dataset.

Jakub-Svoboda commented 6 years ago

@FreeKingofNature Yes, it did work for me. I renamed all the files to .jpg and changed the paths in the train.txt file appropriately.
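
The rename workaround described above can be scripted. The following is a rough sketch, not an official darknet tool: the helper names `to_jpg`, `rename_images`, and `rewrite_list` are made up for illustration, and it assumes every image sits directly in one folder and that train.txt holds one path per line.

```python
from pathlib import Path


def to_jpg(path):
    """Return the path with a trailing .jpeg changed to .jpg (else unchanged)."""
    return path[:-len(".jpeg")] + ".jpg" if path.endswith(".jpeg") else path


def rename_images(image_dir):
    """Rename every *.jpeg file directly inside image_dir to *.jpg."""
    # list() so the directory is not modified while it is being iterated.
    for src in list(Path(image_dir).glob("*.jpeg")):
        src.rename(src.with_suffix(".jpg"))


def rewrite_list(list_file):
    """Update a train.txt-style file so its paths end in .jpg."""
    lines = Path(list_file).read_text().splitlines()
    Path(list_file).write_text("".join(to_jpg(line.strip()) + "\n" for line in lines))
```

For the layout in this thread, one would call `rename_images` on the JPEGImages folder and `rewrite_list` on both train.txt and val.txt. It is worth running this on a copy of the dataset first.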

zn845639326 commented 6 years ago

@Jakub-Svoboda Some regions, such as 94, 82, and 106, sometimes show "-nan". Have you ever run into this? P.S. I use the same dataset.

Region 94 Avg IOU: 0.806114, Class: 0.999890, Obj: 0.996381, No Obj: 0.000783, .5R: 1.000000, .75R: 1.000000,  count: 2
Region 106 Avg IOU: 0.850631, Class: 0.999968, Obj: 0.979259, No Obj: 0.000066, .5R: 1.000000, .75R: 1.000000,  count: 1
Region 82 Avg IOU: 0.835386, Class: 0.999710, Obj: 0.998900, No Obj: 0.011948, .5R: 1.000000, .75R: 0.875000,  count: 8
Region 94 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000793, .5R: -nan, .75R: -nan,  count: 0
Region 106 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000000, .5R: -nan, .75R: -nan,  count: 0
Region 82 Avg IOU: 0.812951, Class: 0.999933, Obj: 0.999951, No Obj: 0.002752, .5R: 1.000000, .75R: 0.500000,  count: 2
Region 94 Avg IOU: 0.791764, Class: 0.999959, Obj: 0.956886, No Obj: 0.000952, .5R: 1.000000, .75R: 1.000000,  count: 2
Region 106 Avg IOU: 0.746094, Class: 0.999896, Obj: 0.812066, No Obj: 0.000048, .5R: 1.000000, .75R: 0.000000,  count: 1
Region 82 Avg IOU: 0.872028, Class: 0.999869, Obj: 0.999937, No Obj: 0.001653, .5R: 1.000000, .75R: 1.000000,  count: 1
Region 94 Avg IOU: 0.859089, Class: 0.999578, Obj: 0.798197, No Obj: 0.001350, .5R: 1.000000, .75R: 1.000000,  count: 5
Region 106 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000000, .5R: -nan, .75R: -nan,  count: 0
Region 82 Avg IOU: 0.871236, Class: 0.999982, Obj: 0.999019, No Obj: 0.002857, .5R: 1.000000, .75R: 1.000000,  count: 3
Region 94 Avg IOU: 0.838271, Class: 0.998296, Obj: 0.991687, No Obj: 0.000716, .5R: 1.000000, .75R: 1.000000,  count: 1
Region 106 Avg IOU: 0.784796, Class: 0.999965, Obj: 0.995162, No Obj: 0.000059, .5R: 1.000000, .75R: 1.000000,  count: 1
Region 82 Avg IOU: 0.856737, Class: 0.999850, Obj: 0.997983, No Obj: 0.006347, .5R: 1.000000, .75R: 1.000000,  count: 6
Region 94 Avg IOU: 0.636101, Class: 0.999767, Obj: 0.582711, No Obj: 0.000054, .5R: 1.000000, .75R: 0.000000,  count: 1
Region 106 Avg IOU: 0.208812, Class: 0.999890, Obj: 0.430457, No Obj: 0.000025, .5R: 0.000000, .75R: 0.000000,  count: 1
Region 82 Avg IOU: 0.882747, Class: 0.999512, Obj: 0.994097, No Obj: 0.004860, .5R: 1.000000, .75R: 1.000000,  count: 3
Region 94 Avg IOU: 0.843018, Class: 0.999823, Obj: 0.998724, No Obj: 0.001323, .5R: 1.000000, .75R: 1.000000,  count: 3
Region 106 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000000, .5R: -nan, .75R: -nan,  count: 0
Region 82 Avg IOU: -nan, Class: -nan, Obj: -nan, No Obj: 0.000014, .5R: -nan, .75R: -nan,  count: 0
Region 94 Avg IOU: 0.763332, Class: 0.999409, Obj: 0.798982, No Obj: 0.001761, .5R: 1.000000, .75R: 0.714286,  count: 7

pjreddie commented 6 years ago

Nans are fine as long as they only show up when the count is 0. This just means that this batch of images doesn't have any objects that show up at that particular scale (note it often happens in Layer 106, which deals with the smallest objects). Thus when it tries to calculate averages, it divides by 0 and gets nan. Not a problem.

When nans are a problem is when training goes off the rails, but then your whole screen will be full of them.
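
To make the rule above concrete, here is a small sketch that mimics the divide-by-zero average and checks whether a "Region" log line shows nan only because its count is 0. The helper names are illustrative assumptions and not part of darknet. Python raises an exception on 0/0 where C floating-point division yields nan, so the sketch returns nan explicitly.

```python
import math
import re


def region_average(total, count):
    """Average over matched objects; nan when there were none (count == 0)."""
    return total / count if count else math.nan


def nan_is_expected(log_line):
    """True when a Region line contains nan and its count is 0."""
    match = re.search(r"count:\s*(\d+)", log_line)
    return "nan" in log_line and match is not None and int(match.group(1)) == 0


harmless = ("Region 106 Avg IOU: -nan, Class: -nan, Obj: -nan, "
            "No Obj: 0.000000, .5R: -nan, .75R: -nan,  count: 0")
print(nan_is_expected(harmless))
```

A line with nan and a nonzero count, or nan on every line of the output, is the case that deserves attention.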

zn845639326 commented 6 years ago

@pjreddie Thank you for your reply. I got it! Your work is amazing! ^_^

Humphryxin commented 5 years ago

For me, the cause is definitely the extension handling in "src/data.c". YOLOv3 only has the following lines:

find_replace(labelpath, "JPEGImages", "labels", labelpath);
find_replace(labelpath, ".jpg", ".txt", labelpath);
find_replace(labelpath, ".JPG", ".txt", labelpath);
find_replace(labelpath, ".JPEG", ".txt", labelpath);

There is no corresponding call for ".jpeg", ".png", or ".PNG", so for those extensions the extension is never replaced and the generated label file name is wrong. Adding the following lines at each place in "src/data.c" where these substitutions occur solves the problem for me:

find_replace(labelpath, ".jpeg", ".txt", labelpath);
find_replace(labelpath, ".png", ".txt", labelpath);
find_replace(labelpath, ".PNG", ".txt", labelpath);
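
The effect of the substitutions quoted above can be reproduced outside darknet. The sketch below is a Python approximation, not the actual C code: `label_path_for` is a hypothetical helper, and it assumes each find_replace call substitutes only the first occurrence of its pattern.

```python
def label_path_for(image_path, extra_exts=()):
    """Approximate how a label path is derived from an image path.

    Applies the substitutions quoted in this thread, in order; extra_exts
    stands in for any additional find_replace calls added to src/data.c.
    """
    path = image_path.replace("JPEGImages", "labels", 1)
    for ext in (".jpg", ".JPG", ".JPEG") + tuple(extra_exts):
        path = path.replace(ext, ".txt", 1)
    return path


image = "/home/kuba/yolo/darknet/dataset/VOCdevkit/VOC2007/JPEGImages/mov_001_007585.jpeg"
# Without a ".jpeg" rule the extension survives, matching the error in the first post.
print(label_path_for(image))
# With the extra rule, the path points at the .txt label file.
print(label_path_for(image, extra_exts=(".jpeg",)))
```

This also suggests why copying the images into /labels/ leads to nan output: the file opens, but it is an image rather than a text file of boxes, so no objects are read from it.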

kkiillee55 commented 4 years ago

Thanks, I use PNG images and the training output was always nan. Your post solved my problem.