Import model output for validating/editing

lonnylundsten commented 2 years ago

I want to use the detection output from yolov5 (https://github.com/ultralytics/yolov5) so that I can validate and clean up the ML proposals in RectLabel and then re-incorporate them into a new training set.

I would like to open the output of yolov5's detect.py in RectLabel so I can 1) clean up and edit the model's detection proposals and 2) prepare that data for generating a new training set or for adding to my existing training data.

Would it be possible for RectLabel to open a video together with the yolo formatted txt output from detect.py, grab the corresponding frames, and generate new localized data from them? Note -- I've also contacted the yolov5 group to see if they could output data in a format that can be easily opened in RectLabel.

ryouchinsa commented 2 years ago

Thanks for writing the issue.

In the yolov5 folder, when we run detect.py on a video file, it saves a labels folder that contains yolo txt files for all frames of the video:
python3 detect.py --source Pexels\ Videos\ 4800.mp4 --save-txt

We will update the "Convert video to image frames" feature so that it saves every frame of the video according to the video's frames per second. That way, every image file exported from RectLabel corresponds to a yolo txt file generated by detect.py.
We will update the "Import YOLO txt files" feature so that it can import the yolo txt files generated by detect.py. Currently the naming rule for each frame is slightly different:
RectLabel: Pexels Videos 4800_frame001.txt
yolov5: Pexels Videos 4800_1.txt

"Convert video to image frames" will be changed to save the first frame as Pexels Videos 4800_001.jpg. "Import YOLO txt files" will be changed to read Pexels Videos 4800_001.txt and, if that file does not exist, to read Pexels Videos 4800_1.txt.

Once we have implemented this update, we will let you know. Please let us know your opinion.
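The lookup order described above (zero-padded name first, then the unpadded yolov5 name) can be sketched as follows. This is only an illustration, not RectLabel's actual code: the function names `candidate_label_names` and `find_label` are hypothetical, and the sketch assumes the frame number is the last underscore-separated part of the file name.

```python
import re
from pathlib import Path
from typing import List, Optional


def candidate_label_names(image_name: str) -> List[str]:
    """Return label file names to try, in order, for one exported frame."""
    stem = Path(image_name).stem
    candidates = [f"{stem}.txt"]  # same name as the image, e.g. ..._001.txt
    match = re.fullmatch(r"(.*)_(\d+)", stem)
    if match:
        prefix, number = match.groups()
        # Assumed yolov5 style: frame number without leading zeros, e.g. ..._1.txt
        unpadded = f"{prefix}_{int(number)}.txt"
        if unpadded not in candidates:
            candidates.append(unpadded)
    return candidates


def find_label(image_name: str, labels_dir: str) -> Optional[Path]:
    """Return the first candidate label file that exists, or None."""
    for name in candidate_label_names(image_name):
        path = Path(labels_dir) / name
        if path.is_file():
            return path
    return None
```

For example, `candidate_label_names("Pexels Videos 4800_001.jpg")` gives `["Pexels Videos 4800_001.txt", "Pexels Videos 4800_1.txt"]`.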

lonnylundsten commented 2 years ago

This sounds like a great solution. Thank you!

ryouchinsa commented 2 years ago

The new update version 65 was released.

We improved "Convert video to image frames" to export every frame of the video corresponding to the labels folder generated by detect.py in the yolov5 folder.

When you run detect.py on a video, the labels folder is generated:
python3 detect.py --source Pexels\ Videos\ 4800.mp4 --save-txt

In the labels folder, there are yolo txt files. Screenshot 2022-10-12 3 01 48

Please use "Convert video to image frames" in RectLabel on the video and choose the second frame suffix option. You will then obtain image frames corresponding to the yolo txt files in the labels folder.
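Before importing, it can help to check that the exported frames and the label files really line up. The sketch below lists frames that have no label file with the same base name. The function name `frames_without_labels` and the `.jpg` default are assumptions for illustration. Whether a missing label file is expected (for example, a frame with no detections) depends on your yolov5 version and is worth verifying yourself.

```python
from pathlib import Path
from typing import List


def frames_without_labels(frames_dir, labels_dir, image_suffix: str = ".jpg") -> List[str]:
    """Return sorted names of image frames with no same-named .txt label file."""
    label_stems = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return sorted(
        p.name
        for p in Path(frames_dir).glob(f"*{image_suffix}")
        if p.stem not in label_stems
    )
```

Calling `frames_without_labels("frames", "runs/detect/exp/labels")` would return the unmatched frame names. Both folder paths here are placeholders for wherever your frames and labels were saved.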

Please let us know your opinion.

lonnylundsten commented 2 years ago

Excellent. I will test it right away!

lonnylundsten commented 2 years ago

OK. So, this mostly works as expected. However, I do have two issues.

1) I have to quit all other programs for the 16,000 images of my detections to be grabbed at 2048 pixels, otherwise it runs out of memory. Smaller dimensions don't cause this issue.
2) The boxes are often on the object of interest, but occasionally they are off or seem to drift -- see attached.

ryouchinsa commented 2 years ago

Thanks for the detailed feedback.

We had been testing with short video files. Using SnapSave, we prepared 2 long video files. https://snapsave.io/en

Soccer video file https://www.youtube.com/watch?v=M7yI1V4O2mc

Startup video file https://www.youtube.com/watch?v=OKD0IAcwMig

Both video files have more than 10000 image frames and the image size is 1920x1080.

For the heap memory problem, we improved our code so that it does not create intermediate files during processing. For both video files, in our environment, heap memory usage during processing stayed below 80MB.

We submitted the new update. When it is released, we will let you know.

For the coordinates problem, could you send the image frame and the yolo txt file generated by yolov5 to support@rectlabel.com?
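To check a suspicious box by hand, a yolo txt line can be converted to pixel coordinates and compared with the frame. This is a minimal sketch assuming the usual yolo txt layout of `class x_center y_center width height`, with values normalized to the image size. The function name `yolo_line_to_pixel_box` is hypothetical.

```python
def yolo_line_to_pixel_box(line: str, image_width: int, image_height: int):
    """Convert one yolo txt line to (class_id, (left, top, right, bottom)) in pixels."""
    fields = line.split()
    class_id = int(fields[0])
    # Only the first four numbers after the class id are used;
    # any extra trailing field (such as a confidence value) is ignored.
    x_center, y_center, width, height = (float(v) for v in fields[1:5])
    left = (x_center - width / 2) * image_width
    top = (y_center - height / 2) * image_height
    right = (x_center + width / 2) * image_width
    bottom = (y_center + height / 2) * image_height
    return class_id, (left, top, right, bottom)
```

For a 2048x1080 frame, the line `0 0.5 0.5 0.25 0.5` maps to class 0 with the box (768.0, 270.0, 1280.0, 810.0). If the pixel box matches the object on one frame but not on the frame with the same number in RectLabel, the frame-to-label pairing is a more likely cause than the coordinates themselves.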

ryouchinsa commented 2 years ago

The new update version 66 was released.

We improved the heap memory usage and error handling for "Convert video to image frames".

Please let us know your opinion.

lonnylundsten commented 2 years ago

It still crashes when converting a 2048 x 1080 60 FPS 00:07:11 (~7 minute) ProRes HQ video file.

My computer specs are attached. Screen Shot 2022-10-12 at 4 38 19 PM

ryouchinsa commented 2 years ago

Thanks for the detailed feedback.

We had been testing with 30fps videos. Using an iPad, we recorded 60fps videos longer than 10 minutes, and we could reproduce the application memory problem.

We were retrieving all image frames from the video file at once, so we changed the processing to handle 1000 image frames at a time. With this update, the processing goes through to the end without encountering the application memory problem.

We submitted the new update to Apple. When it is released, we will let you know.
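The general idea of processing in groups of 1000 frames, rather than holding every frame at once, can be sketched like this. It is a generic Python illustration of batching, not RectLabel's implementation, and the helper name `batched` is chosen here for the example.

```python
from itertools import islice
from typing import Iterable, Iterator, List


def batched(items: Iterable, batch_size: int = 1000) -> Iterator[List]:
    """Yield successive lists of at most batch_size items from any iterable."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Example: frame indices for a 7 min 11 s video at 60 FPS (431 s * 60 = 25860 frames).
batch_sizes = [len(batch) for batch in batched(range(25860), 1000)]
```

Only one batch is held in memory at a time, so peak memory depends on the batch size rather than on the length of the video. In the example, `batch_sizes` has 26 entries: 25 full batches of 1000 and a final batch of 860.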

ryouchinsa commented 2 years ago

The new update version 67 was released.

We improved the application memory usage of "Convert video to image frames" for 60fps videos longer than 10 minutes.

Please let us know your feedback.

lonnylundsten commented 2 years ago

Yes, that worked well. Thank you.

ryouchinsa commented 2 years ago

If you have any other requests, please let us know.

ryouchinsa / Rectlabel-support

Import model output for validating/editing #203