Tianxiaomo / pytorch-YOLOv4

PyTorch, ONNX and TensorRT implementation of YOLOv4
Apache License 2.0

train is good, but can't detect anything (on the same val data) #239

Open s2h22 opened 4 years ago

s2h22 commented 4 years ago

Hello, first of all, I really appreciate this wonderful work!!

Training finished without any problems:

[image]

In this training, I used 15 samples for training and 4 for validation. But for one of the 4 validation samples, my custom model's detections look like the image below:

[image]

and the candidate boxes (with confidence_threshold = 0.4) look like this:

[image]

I think it's too small to detect objects. I don't know why the model is not working even though its performance is not bad. I would really appreciate any help!

1962975362 commented 4 years ago

Hello, could I take a look at your dataset format? I can't get my cocoapi to work.

s2h22 commented 4 years ago

> Hello, could I take a look at your dataset format? I can't get my cocoapi to work.

Hello, friend!! I have the original image files, and the val.txt I used looks like this:

16.jpg 535,1145,1310,2010,0 1665,783,2378,1812,1 2697,1283,3934,1928,2
17.jpg 189,1029,1202,2218,0 1455,437,2326,1680,1 2567,901,3918,1744,2
18.jpg 3117,695,3966,1680,0 161,537,1208,1922,1 1509,1065,2962,2246,2
19.jpg 2601,1059,3506,2268,0 1713,341,2492,1466,1 993,1343,2484,2644,2

and train.txt has the same format.

is this the thing you want?
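
For reference, here is a minimal sketch of reading one line of the val.txt format shown above. It assumes each line is an image path followed by space-separated `x1,y1,x2,y2,id` groups in pixel coordinates, and that image paths contain no spaces. The function name `parse_annotation_line` is hypothetical and is not part of the repository.

```python
def parse_annotation_line(line):
    """Split one annotation line into (image_path, [(x1, y1, x2, y2, class_id), ...])."""
    parts = line.split()
    if not parts:
        raise ValueError("empty annotation line")
    image_path, boxes = parts[0], []
    for token in parts[1:]:
        fields = token.split(",")
        if len(fields) != 5:
            raise ValueError(f"expected x1,y1,x2,y2,id but got {token!r}")
        x1, y1, x2, y2, class_id = (int(v) for v in fields)
        # The bottom-right corner must lie below and to the right of the
        # top-left corner; anything else is treated as a labeling error.
        if x2 <= x1 or y2 <= y1:
            raise ValueError(f"degenerate box {token!r} in {image_path}")
        boxes.append((x1, y1, x2, y2, class_id))
    return image_path, boxes


if __name__ == "__main__":
    path, boxes = parse_annotation_line(
        "16.jpg 535,1145,1310,2010,0 1665,783,2378,1812,1 2697,1283,3934,1928,2"
    )
    print(path, boxes)
```

Running a check like this over every line of train.txt and val.txt is one way to catch the kind of labeling mistakes discussed later in this thread before training starts.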

1962975362 commented 4 years ago

Thanks. My dataset is in the same format as yours, but training fails with a baffling error:

File "/home/user/lyf/PycharmProjects/YOLOv4/YOLOv4/dataset.py", line 355, in __getitem__
    if (min_w_h / 8)< blur and blur > 1 and any(min_w_h): # disable blur if one of the objects is too small
ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Also, cocoapi reports an error: keyvalue: no "annotations"

I don't know whether you have run into this.
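
The ValueError in the traceback above is what NumPy raises when an array with more than one element is used where a single True/False is needed, as with Python's `and`. The sketch below reproduces that behaviour and shows one scalar form of the comparison. It is only an illustration: the helper names are hypothetical, and reducing with the minimum is an assumption about the intent of the line in dataset.py, not a confirmed fix for the repository.

```python
import numpy as np


def condition_raises(min_w_h, blur):
    """Return True if the comparison from the traceback raises ValueError for this input."""
    try:
        # `and` needs one truth value for its left operand; a multi-element
        # boolean array cannot provide one, so NumPy raises ValueError.
        bool((min_w_h / 8) < blur and blur > 1)
    except ValueError:
        return True
    return False


def smallest_side_below_blur(min_w_h, blur):
    """Scalar form of the comparison, using the smallest value in min_w_h (an assumption)."""
    smallest = float(np.min(min_w_h))
    return blur > 1 and (smallest / 8) < blur


if __name__ == "__main__":
    sizes = np.array([40.0, 12.0])
    print(condition_raises(sizes, 3))          # array input triggers the error
    print(condition_raises(12.0, 3))           # scalar input does not
    print(smallest_side_below_blur(sizes, 3))  # reduced to a scalar first
```

If `min_w_h` arrives as an array here, it may also be worth checking the label lines for that image, since the thread suggests annotation problems as a likely cause.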

s2h22 commented 4 years ago

Hmm.. I have not encountered the problem you mentioned.

I think it's likely that your problem is caused by wrong labels in your val.txt or train.txt. How about rechecking those two files?

Hope it helps!

zhouzhubin commented 4 years ago

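@s2h22 My situation is different from yours: the AP I get from training is always 0. How many epochs did you train for, and was the AP already high early on?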

s2h22 commented 4 years ago

> @s2h22 My situation is different from yours: the AP I get from training is always 0. How many epochs did you train for, and was the AP already high early on?

I set 3000 epochs for training and found that the AP score was over 0.8 after 1000 epochs.

zhouzhubin commented 4 years ago

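@s2h22 Thank you. Is your inference working well now? I trained for a long time last night and the AP is no longer 0, but it is still very low, below 0.1, and the results in actual testing are also poor. Missed detections and false detections are quite serious.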

s2h22 commented 4 years ago

> @s2h22 Thank you. Is your inference working well now? I trained for a long time last night and the AP is no longer 0, but it is still very low, below 0.1, and the results in actual testing are also poor. Missed detections and false detections are quite serious.

I still haven't found a solution to my problem. I'm sorry that I don't know exactly what your problem is, but it may be related to labeling. In my case, I also got an AP score of 0 like you and finally found out that it was due to wrong labeling.

Hope it helps..!

zhouzhubin commented 4 years ago

@s2h22 OK, thank you.

wllkk commented 4 years ago

Hello, has your problem been solved? After training for 100 epochs, I also can't detect anything.

jayhung97724 commented 4 years ago

@s2h22 some questions about the location of the bounding box:

16.jpg 535,1145,1310,2010,0 1665,783,2378,1812,1 2697,1283,3934,1928,2
17.jpg 189,1029,1202,2218,0 1455,437,2326,1680,1 2567,901,3918,1744,2
18.jpg 3117,695,3966,1680,0 161,537,1208,1922,1 1509,1065,2962,2246,2
19.jpg 2601,1059,3506,2268,0 1713,341,2492,1466,1 993,1343,2484,2644,2

Are those the pixel coordinates of the top-left and bottom-right corners of the bounding boxes?

However, the instructions in AlexeyAB/darknet read like this:

Where:
- <object-class> - integer object number from 0 to (classes-1)
- <x_center> <y_center> <width> <height> - float values relative to width and height of image, it can be equal from (0.0 to 1.0]
- for example: <x> = <absolute_x> / <image_width> or <height> = <absolute_height> / <image_height>
- attention: <x_center> <y_center> - are center of rectangle (are not top-left corner)

For example for img1.jpg you will be created img1.txt containing:

1 0.716797 0.395833 0.216406 0.147222
0 0.687109 0.379167 0.255469 0.158333
1 0.420312 0.395833 0.140625 0.166667

So is the annotation format different in the PyTorch version by @Tianxiaomo?

s2h22 commented 4 years ago

  1. Are those the pixel coordinates of the top-left and bottom-right corners of the bounding boxes? -> Yes, you're right. I referred to the image below.

[image]

The image comes from https://github.com/Tianxiaomo/pytorch-YOLOv4/blob/master/Use_yolov4_to_train_your_own_data.md

  2. So is the annotation format different in the PyTorch version by @Tianxiaomo? -> It seems so.

thank you :)
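
To make the difference between the two formats concrete, here is a minimal sketch that converts one Darknet-style label line (class id, then normalized center x, center y, width, height) into the `x1,y1,x2,y2,id` pixel form used in the val.txt above. The function name `darknet_to_corners` is hypothetical, the image size has to be supplied by the caller, and the class id is passed through unchanged, which is an assumption rather than something settled in this thread.

```python
def darknet_to_corners(label_line, img_w, img_h):
    """Convert 'cls xc yc w h' (normalized) to 'x1,y1,x2,y2,cls' (pixels)."""
    class_id, xc, yc, w, h = label_line.split()
    xc, yc, w, h = (float(v) for v in (xc, yc, w, h))
    # Center/size -> corners, then scale from the 0..1 range to pixels.
    x1 = round((xc - w / 2) * img_w)
    y1 = round((yc - h / 2) * img_h)
    x2 = round((xc + w / 2) * img_w)
    y2 = round((yc + h / 2) * img_h)
    return f"{x1},{y1},{x2},{y2},{int(class_id)}"


if __name__ == "__main__":
    # A box covering the central half of a 100x200 image.
    print(darknet_to_corners("1 0.5 0.5 0.5 0.5", 100, 200))
```

One line of train.txt or val.txt would then be the image path followed by the converted strings for every object in that image, separated by spaces.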

s2h22 commented 4 years ago

> Hello, has your problem been solved? After training for 100 epochs, I also can't detect anything.

I'm so sorry for the late reply. Unfortunately, I haven't solved it yet. Once I solve it, I'll let you know.

Thank you :)

EternalEvan commented 3 years ago

Are val.txt and train.txt the same file, or are they different? I'm so confused. Can you help me?

s2h22 commented 3 years ago

They have the same format, and no item should appear in both files. See below:

image_path1 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
image_path2 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
image_path3 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
...
image_path10 x1,y1,x2,y2,id x1,y1,x2,y2,id ...

image_path9 x1,y1,x2,y2,id x1,y1,x2,y2,id ... (WRONG, overlapping)
image_path10 x1,y1,x2,y2,id x1,y1,x2,y2,id ... (WRONG, overlapping)
image_path11 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
image_path12 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
...

Hope it helps 👍

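
The no-overlap rule described above can be checked mechanically. The sketch below is a small illustration under the assumption that the first whitespace-separated field of every line is the image path; the helper names are hypothetical and not part of the repository.

```python
def image_paths(lines):
    """Collect the image path (first field) of every non-empty annotation line."""
    return {line.split()[0] for line in lines if line.strip()}


def overlapping_images(train_lines, val_lines):
    """Return the image paths that appear in both lists, sorted for stable output."""
    return sorted(image_paths(train_lines) & image_paths(val_lines))


if __name__ == "__main__":
    train = ["a.jpg 1,2,30,40,0", "b.jpg 5,6,70,80,1"]
    val = ["b.jpg 5,6,70,80,1", "c.jpg 9,9,99,99,2"]
    print(overlapping_images(train, val))  # images listed in both files
```

With real files, the two lists would come from reading train.txt and val.txt line by line; an empty result means the split is clean.
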
EternalEvan commented 3 years ago

Thank you very much, that helped me a lot. So when I train with the pictures of coins like the author, what should I write in val.txt?

s2h22 commented 3 years ago

You should write the coordinates of two points (top-left (x1,y1) and bottom-right (x2,y2)) and the class number (id) starting from 1.

Good luck with your work 😎

EternalEvan commented 3 years ago

Thanks, but you said that the content of val.txt can't overlap with train.txt. The author only provides 25 images in train.txt, so should val.txt be empty, or do I need to find more images of coins?

s2h22 commented 3 years ago

Of course, making your own data is best, but you know.. it is very inconvenient 😂 You could use 20 of the images for training and the rest for validation. I guess it would be OK to set some of the data aside for validation.

hope this helps!!
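
A minimal sketch of the 20/5 split suggested above, assuming the 25 annotation lines have already been read from the provided train.txt into a list. The function name `split_annotations` and the fixed seed are illustrative choices, not anything the repository prescribes.

```python
import random


def split_annotations(lines, n_val, seed=0):
    """Shuffle annotation lines and return (train_lines, val_lines) with no overlap."""
    entries = [line.strip() for line in lines if line.strip()]
    if not 0 < n_val < len(entries):
        raise ValueError("n_val must be between 1 and len(entries) - 1")
    shuffled = entries[:]
    # A seeded generator keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    return shuffled[n_val:], shuffled[:n_val]


if __name__ == "__main__":
    lines = [f"img{i}.jpg 0,0,10,10,0" for i in range(25)]
    train_lines, val_lines = split_annotations(lines, n_val=5)
    print(len(train_lines), len(val_lines))
```

Writing `train_lines` and `val_lines` back out as the new train.txt and val.txt gives two files in the same format with no image in both.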

EternalEvan commented 3 years ago

Thank you for your wonderful reply yesterday, which helped my work a lot. When I run train.py, the following messages are printed:

convalution havn't activate linear
convalution havn't activate linear
convalution havn't activate linear
in function convert_to_coco_api...
You could also create your own 'get_image_id' function.
creating index...
index created!

Does "convalution havn't activate linear" indicate anything wrong?

lidanyang916 commented 3 years ago

Hi, were you able to solve this issue?

GeorgeTsio commented 6 months ago

> > val.txt and train.txt Are the two documents the same or they are different? I'm so confused. Can you help me?
>
> they have the same format and there is no overlapping item each other. See below
>
> image_path1 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
> image_path2 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
> image_path3 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
> ...
> image_path10 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
>
> image_path9 x1,y1,x2,y2,id x1,y1,x2,y2,id ... (WRONG, overlapping)
> image_path10 x1,y1,x2,y2,id x1,y1,x2,y2,id ... (WRONG, overlapping)
> image_path11 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
> image_path12 x1,y1,x2,y2,id x1,y1,x2,y2,id ...
> ...
>
> hopes it helps 👍

About that, where should I put these 2 files? And what should the -dir parameter contain?