Open · sadimoodi opened this issue 5 years ago
I can't give a meaningful reply without far more info, but maybe you are playing the video too fast, you have compressed the video too much, or you have made a programmatic mistake in extracting frames from the video stream.
Try changing 3 to 1 here: https://github.com/AlexeyAB/darknet/blob/2fa539779f4e12e264b9e1b2fc463ac7edec165c/src/demo.c#L40
If that doesn't help, then you are doing something wrong.
> I can't give a meaningful reply without far more info, but maybe you are playing the video too fast, you have compressed the video too much, or you have made a programmatic mistake in extracting frames from the video stream.
I have compiled the video in Full HD from the SAME images, with a 3-second interval between images. It's pretty straightforward, and I am not doing any programming here, just running darknet.exe from the command line.
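For reference, a slideshow video like the one described can be produced with ffmpeg with the compression level under explicit control; the image naming pattern, frame rates, and output name below are assumptions for illustration, not the exact settings used in this thread:

```shell
# Sketch: show each still for 3 seconds in a 30 fps H.264 video.
# -crf 0 asks libx264 for lossless output; img_%03d.jpg is an assumed naming pattern.
ffmpeg -framerate 1/3 -i img_%03d.jpg -r 30 -c:v libx264 -crf 0 slideshow.mp4
```

Adding `-pix_fmt yuv420p` makes the file play in more players, at the cost of chroma subsampling (no longer fully lossless).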
> Try changing 3 to 1 here:
> https://github.com/AlexeyAB/darknet/blob/2fa539779f4e12e264b9e1b2fc463ac7edec165c/src/demo.c#L40
> If that doesn't help, then you are doing something wrong.
@AlexeyAB thank you for your reply, Alex. I have made the change you suggested, but it didn't change anything. As mentioned above, I am not writing any code, just using the command line. I have followed your repository guidelines to the letter and still can't figure out what could be wrong.
Can you show examples of detection on video and images?
Here is the still image, accuracy is 99%:
It is confidence_score. But yes, something is going wrong.
> Alex, I have made the change you suggested, but it didn't change anything.
Did you recompile Darknet after changing this line? https://github.com/AlexeyAB/darknet/blob/2fa539779f4e12e264b9e1b2fc463ac7edec165c/src/demo.c#L40
> I have compiled the video in Full HD from the SAME images, with a 3-second interval between images. It's pretty straightforward, and I am not doing any programming here, just running darknet.exe from the command line.
What FPS did you use for video creation? Is it ~0.333 FPS or ~25 FPS?
When playing a video with the same image across several frames, is the confidence_score ~2% on all frames?
Can you share your cfg-file?
> Here is the still image, accuracy is 99%:
> It is confidence_score. But yes, something is going wrong.
> Alex, I have made the change you suggested, but it didn't change anything.
> Did you recompile Darknet after changing this line?
Of course. I used CMake, then Visual Studio → Build (Release, x64), then copied the built files from the Release folder inside x64, then ran darknet test and darknet demo.
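For anyone reproducing this, the rebuild cycle matters because an edit to src/demo.c only takes effect in a freshly built binary; a sketch of the CMake workflow described above (directory names assumed):

```shell
# Sketch: reconfigure and rebuild after editing src/demo.c (build dir name assumed)
cd darknet
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --config Release
# make sure the darknet.exe you run afterwards is the newly built one, not an older copy
```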
> https://github.com/AlexeyAB/darknet/blob/2fa539779f4e12e264b9e1b2fc463ac7edec165c/src/demo.c#L40
> I have compiled the video in Full HD from the SAME images, with a 3-second interval between images. It's pretty straightforward, and I am not doing any programming here, just running darknet.exe from the command line.
> What FPS did you use for video creation? Is it ~0.333 FPS or ~25 FPS?
It's 30 frames per second.
> When playing a video with the same image across several frames, is the confidence_score ~2% on all frames?
Yes, ALL frames get the same confidence_score for the same object at all times.
> Can you share your cfg-file?
Here is my config file: https://moodiali-my.sharepoint.com/:u:/g/personal/ali_inteslar_com/EdyFcNtjR4tDm9ceOlcQIUIBYZV4J07IPN66LJKS4p4xKw?e=97XrwJ
Thanks a lot!
Try to train by using this cfg-file (I added antialiasing=1): yolo-obj.cfg.txt
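As far as I can tell from the CFG-Parameters wiki, antialiasing=1 is a [maxpool]-layer option that applies a blur before downsampling to reduce aliasing; a hedged sketch of how such a layer might look (the size/stride values are illustrative, not taken from the attached cfg):

```
[maxpool]
size=2
stride=2
antialiasing=1   # blur before downsampling (reduces aliasing artifacts)
```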
> Try to train by using this cfg-file (I added antialiasing=1): yolo-obj.cfg.txt
Thanks, I will train tonight. Can you explain what this parameter is? Also, my training images are approximately 546x1641 pixels; can I increase the network size during training to 608x608?
> Try to train by using this cfg-file (I added antialiasing=1): yolo-obj.cfg.txt
I just got an error during training while using the config file you suggested: "CUDA out of memory", although my subdivisions = 64. What should I do?
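For context, darknet's GPU memory use is driven mainly by the network resolution and the effective mini-batch (batch/subdivisions) in the [net] section; a sketch of the relevant lines (values illustrative):

```
[net]
batch=64
subdivisions=64   # mini-batch = batch/subdivisions = 1 image per GPU step
width=416         # lowering width/height is the other main memory lever
height=416
```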
> Also, my training images are approximately 546x1641 pixels.
What GPU do you use?
Set width=256 height=768, and set random=0 in the last [yolo] layer.
> > Also, my training images are approximately 546x1641 pixels.
>
> What GPU do you use?
> Set width=256 height=768, and set random=0 in the last [yolo] layer.
I am using an Nvidia GeForce GTX 1070. Can you say why you chose these dimensions, width=256 height=768?
To keep the same percentage of detail for x and y: 256/546 ~= 50%, 768/1641 ~= 50%.
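The arithmetic behind that choice can be checked directly; 546x1641 are the image dimensions mentioned earlier in the thread:

```shell
# Both network dimensions downscale the source image by nearly the same factor,
# so x and y detail are reduced equally.
awk 'BEGIN { printf "w: %.3f  h: %.3f\n", 256/546, 768/1641 }'
# prints: w: 0.469  h: 0.468
```

Both factors come out near 0.47, i.e. roughly the "50%" AlexeyAB mentions.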
> > Also, my training images are approximately 546x1641 pixels.
>
> What GPU do you use?
> Set width=256 height=768, and set random=0 in the last [yolo] layer.
Hello @AlexeyAB, I am getting worse results using the above settings; the original results were much better. What could be wrong? Do you think I need to recalculate the anchors? So far I have used the default anchors.
@AlexeyAB I would like to request a training session (that I will pay for), just to familiarize myself with the configuration. Can we do that?
So try training with width=608 height=608 random=0.
You should recalculate anchors only if you are an expert in DNNs and can follow all these rules: https://github.com/AlexeyAB/darknet#how-to-improve-object-detection
Also show your Loss & mAP chart
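If I recall correctly, this repository draws that chart automatically when training is started with the -map flag, writing chart.png to the working directory; the data and pretrained-weights file names below are assumptions:

```
darknet detector train data/obj.data cfg/yolo-obj.cfg darknet53.conv.74 -map
```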
> @AlexeyAB I would like to request a training session (that I will pay for), just to familiarize myself with the configuration. Can we do that?

@AlexeyAB can you answer this question, please?
@AlexeyAB while training, my Nvidia GTX 1070 gets only 36% utilization. How can I force YOLO to use more GPU power (like 90%) and thereby maybe increase training speed?
What do you mean? Write me your suggestion: alexeyab84@gmail.com
Compile darknet with OPENCV=1 GPU=1 CUDNN=1
Show screenshot of output of GPU-Z / nvidia-smi
And show your Loss & mAP chart.png file
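On Windows, GPU-Z shows utilization graphically; with nvidia-smi the same numbers can be sampled on a loop, e.g.:

```
# print GPU utilization and memory use once per second
nvidia-smi --query-gpu=utilization.gpu,memory.used --format=csv -l 1
```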
@AlexeyAB I sent you an email from (ali@ainanolab.com); can you please reply? Check your junk folder.
@sadimoodi
You get very high accuracy, 99% mAP, on the test dataset.
- It seems that you somehow change your images while saving them to the video.
- Or you use an incorrect command for detection on video.

Can you show screenshots of the commands for detection on images and video?

Short way:
Try to make the video file with the highest quality (minimum compression).

Long way:
> Here is the SAME image in a video using detector, accuracy drops down to 2%

Try to grab frames from your video by using Yolo_mark: https://github.com/AlexeyAB/Yolo_mark
Run:
yolo_mark.exe data/img cap_video test.mp4 10
Mark objects on these frames by using Yolo_mark as usual:
yolo_mark.exe data/img data/train.txt data/obj.names
And train your model using these new images. Will object detection on video then work perfectly?
> @sadimoodi
> You get very high accuracy, 99% mAP, on the test dataset.
> - It seems that you somehow change your images while saving them to the video.
> - Or you use an incorrect command for detection on video.
> Can you show screenshots of the commands for detection on images and video?

Command for detecting on an image: darknet detector test data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_best.weights data/test/26-9/1.jpg -thresh 0.01 -ext_output
Command for video: darknet detector demo data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_best.weights -ext_output data/test/26-9/newvideo.mp4 -thresh 0.01

> Short way:
> Try to make the video file with the highest quality (minimum compression).

I have already tried that (Full HD video, 1080p).

> Long way:
> > Here is the SAME image in a video using detector, accuracy drops down to 2%
> Try to grab frames from your video by using Yolo_mark: https://github.com/AlexeyAB/Yolo_mark and run:
> yolo_mark.exe data/img cap_video test.mp4 10
> Mark objects on these frames by using Yolo_mark as usual:
> yolo_mark.exe data/img data/train.txt data/obj.names
> And train your model using these new images. Will object detection on video then work perfectly?

I haven't tried this, but I already did the labeling the correct way and all seems good.
> I haven't tried this, but I already did the labeling the correct way and all seems good.

Do you mean that you solved this issue and it can be closed?
> I have already tried that (Full HD video, 1080p).

I mean video-compression quality rather than video resolution.
> command for detecting on an image: darknet detector test data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_best.weights data/test/26-9/1.jpg -thresh 0.01 -ext_output
> for video: darknet detector demo data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_best.weights -ext_output data/test/26-9/newvideo.mp4 -thresh 0.01

Show screenshots.
> > I haven't tried this, but I already did the labeling the correct way and all seems good.
>
> Do you mean that you solved this issue and it can be closed?

No, I am still experimenting and the issue remains: poor detection on videos. But the more I train, the better things get.
> > I have already tried that (Full HD video, 1080p).
>
> I mean video-compression quality rather than video resolution.

OK, I can't control the compression quality; I am using the MP4 file format, and the resolution is Full HD.
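Compression in an MP4 is actually set by the encoder, not the container; with ffmpeg, for example, libx264's -crf flag controls it (0 is lossless, higher values are smaller but lossier). To see what the encode did to the frames darknet sees, one can also extract a frame and compare it with the source still (file names here are assumptions):

```
# grab the frame shown at t=1s for side-by-side comparison with the original image
ffmpeg -i newvideo.mp4 -ss 1 -frames:v 1 frame_at_1s.png
```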
> > command for detecting on an image: darknet detector test data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_best.weights data/test/26-9/1.jpg -thresh 0.01 -ext_output for video: darknet detector demo data/obj.data cfg/yolo-obj.cfg backup/yolo-obj_best.weights -ext_output data/test/26-9/newvideo.mp4 -thresh 0.01
>
> Show screenshots.

I have shown screenshots at the top of this thread.
Can you please explain why you suggested random=0 in the last layer? What does this parameter do? And what does random=1 do?
> I have shown screenshots at the top of this thread.

There is no screenshot of the command.
Here is the result of the command for detection on still images: confidence factor = 85%.
Here is the result of the command for video detection: confidence factor = 63%.
You can see the difference in confidence factor between the still image and the video for the SAME image; I have used the best compression possible for the video.
Again, can you explain what random=0 in the last layer does?
Show a screenshot of the commands rather than a screenshot of the commands' results. Check that you use identical cfg-files and other params in both commands.
85% vs 63% confidence_score isn't a big difference. Check that you use identical resolution, aspect ratio, and identical compression ratio in both the video and the image.
Read about random here: https://github.com/AlexeyAB/darknet/wiki/CFG-Parameters-in-the-different-layers
random=1 can be used for training the model, so that after training you can change width= and height= in the cfg-file by /1.4 to x1.4 without a drop in accuracy.
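To make that /1.4 to x1.4 rule concrete, here is the range of valid network sizes it allows for a hypothetical base size of 416 (darknet requires multiples of 32; 416 is an assumption for illustration):

```shell
# Valid post-training sizes for base 416 under the /1.4 .. x1.4 rule,
# rounded inward to multiples of 32.
awk 'BEGIN {
  base = 416
  lo = int(base / 1.4 / 32 + 0.999) * 32   # round up to a multiple of 32
  hi = int(base * 1.4 / 32) * 32           # round down to a multiple of 32
  printf "range: %d..%d\n", lo, hi
}'
# prints: range: 320..576
```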
Here is the command for image detection:
Here is the command for video detection:
> Check that you use identical resolution, aspect ratio, and identical compression ratio in both the video and the image.

So this is the reason.
Hello Alex, everyone,
I am working with darknet.exe (with GPU) and everything works well with still images. But when I compile the SAME images into a video (MP4), I get a very low detection rate (almost no detections on some images). What's wrong?
Note: I have trained on a custom object (1 class ONLY).