ryouchinsa / Rectlabel-support

RectLabel is an offline image annotation tool for object detection and segmentation.
https://rectlabel.com
504 stars 73 forks

Video support #221

Open lonnylundsten opened 1 year ago

lonnylundsten commented 1 year ago

I have a couple of suggestions for additions to RectLabel:

  1. Add the ability to choose which frames are captured from a video: instead of grabbing every frame, let the user set a sampling rate, e.g., 1 frame every four seconds.

  2. Add the ability to run inference on a video clip and on a live video feed: https://developer.apple.com/documentation/vision/recognizing_objects_in_live_capture

  3. I'm not sure if you've seen this, but CoreMLPlayer has some really cool features: https://github.com/npna/CoreMLPlayer

I've asked the developer to allow exporting to YOLO format, like RectLabel does, and to incorporate a camera input. Adding the capabilities of his app to RectLabel would be really fantastic.

We've also developed a QuickTime-based video player that supports drawing localization boxes on the video: https://github.com/mbari-org/Sharktopoda/releases/tag/2.0.3

ryouchinsa commented 1 year ago

Thanks for writing the issue.

Thanks for introducing CoreMLPlayer; we checked how it works. We will implement your feature requests one by one.

  1. Setting the frames per second when converting a video to image frames.
  2. Running a Core ML model on a video clip and saving the YOLO txt files.
  3. Running a Core ML model on live captured video and saving the YOLO txt files.

For Sharktopoda, do you have documentation on how to use it?

Thank you.

(Screenshot 2023-05-01 18 35 50)

lonnylundsten commented 1 year ago

Documentation is probably pretty sparse. From a user's perspective, the instructions are here: https://docs.mbari.org/vars-annotation/setup/

The video player communicates with our annotation app, VARS, via UDP. The video player allows us to draw and display ML proposals on the video itself. VARS reads from and writes to a SQL Server database directly. Each localization is stored in a database column as JSON, like: {"x": 1527, "y": 323, "width": 43, "height": 119, "generator": "yolov5-mbari315k" }. The class label is stored in a different column.
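For anyone wanting to move localizations like the one above into the YOLO txt format discussed earlier in this thread, here is a rough Python sketch. It rests on several assumptions that the thread does not confirm: that `x` and `y` are the top-left corner of the box in pixels, that the frame size (1920x1080 here) is known separately, and that the class label from the other column has already been mapped to an integer index. The function name `localization_to_yolo` is hypothetical and is not part of VARS, Sharktopoda, or RectLabel.

```python
import json


def localization_to_yolo(localization_json, image_width, image_height, class_index):
    """Convert one pixel-space box into a YOLO txt line.

    YOLO txt lines are: class x_center y_center width height,
    with all four box values normalized to the range 0..1.
    """
    box = json.loads(localization_json)
    # Assumption: x, y are the top-left corner of the box in pixels.
    x_center = (box["x"] + box["width"] / 2) / image_width
    y_center = (box["y"] + box["height"] / 2) / image_height
    width = box["width"] / image_width
    height = box["height"] / image_height
    return f"{class_index} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"


# The example localization from the comment above.
example = '{"x": 1527, "y": 323, "width": 43, "height": 119, "generator": "yolov5-mbari315k"}'

# Assumed frame size and class index, for illustration only.
line = localization_to_yolo(example, image_width=1920, image_height=1080, class_index=0)
print(line)
```

Extra keys such as `generator` are simply ignored, since YOLO txt files have no field for them.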

ryouchinsa commented 8 months ago

I am sorry for the late reply.

The new version 2023.12.08 has been released. When converting a video to image frames, you can now set the frames per second.

I will implement the remaining requests one by one.
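To illustrate the arithmetic behind a frames-per-second setting like the one described above, here is a minimal Python sketch. It is not RectLabel's implementation. The function name `sampled_frame_indices` and its parameters are hypothetical, and it assumes a constant source frame rate. A request such as "1 frame every four seconds" corresponds to a target rate of 0.25 frames per second.

```python
def sampled_frame_indices(total_frames, video_fps, target_fps):
    """Return the indices of the source frames to keep when sampling at target_fps."""
    if video_fps <= 0 or target_fps <= 0:
        raise ValueError("frame rates must be positive")
    # Number of source frames between two kept frames.
    # Clamped to 1 so a target above the source rate keeps every frame once.
    step = max(1.0, video_fps / target_fps)
    indices = []
    k = 0
    while int(k * step) < total_frames:
        indices.append(int(k * step))
        k += 1
    return indices


# A 10-second clip at 30 fps, sampled at 1 frame every four seconds (0.25 fps).
print(sampled_frame_indices(total_frames=300, video_fps=30.0, target_fps=0.25))
# prints [0, 120, 240]
```

Computing each index from `k * step` rather than repeatedly adding a rounded step avoids drift when the source rate is not an integer multiple of the target rate.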