z-x-yang / Segment-and-Track-Anything

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation purposes.
GNU Affero General Public License v3.0

Real-time capability and fine-tuning a model #110

Open Helge543 opened 1 year ago

Helge543 commented 1 year ago

Hi,

I hope this is the right place for my questions. The framework looks very promising and exciting to me, but I am not 100% sure whether I can use it for my project. A few questions are important to me:

  1. Is it real-time capable? That is, can it be applied to frames (every Nth frame/image) streamed from an external camera device to my PC over Wi-Fi, assuming the Wi-Fi connection works well and reasonably good PC hardware is used (the graphics card I am using, for example, is an RTX 4070 Ti)?

  2. Can a model be fine-tuned on my own dataset with several different classes? If so, does training code exist, and roughly how long would training take on the card mentioned above, or is that far beyond its capabilities?

Thanks in advance!

yamy-cheng commented 12 months ago

Hello, thank you for your interest.

  1. SAM-Track uses SAM to obtain annotations of objects in key frames, but its real-time capability cannot be guaranteed.
  2. Do you mean training DeAOT (the tracking model used in SAM-Track) on your own dataset? I'm not sure whether that will work. You can find more information about how to train DeAOT there.
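The split described in the first answer, running the expensive SAM segmentation only on key frames and letting the tracker propagate masks in between, can be sketched as a simple scheduling loop. This is a minimal illustration of the idea, not the SAM-Track API; `process_stream` and the "segment"/"propagate" labels are hypothetical stand-ins for the real SAM and DeAOT calls.

```python
# Minimal sketch of the key-frame scheduling idea (hypothetical, not the
# SAM-Track API): every Nth frame gets a full SAM segmentation pass, and
# the frames in between are handled by cheap tracker propagation.

def process_stream(frames, n):
    """Yield (frame_index, action) pairs.

    action is "segment" on every Nth frame (index 0, n, 2n, ...) and
    "propagate" on all other frames.
    """
    for i, frame in enumerate(frames):
        if i % n == 0:
            yield i, "segment"    # expensive key-frame call (SAM in SAM-Track)
        else:
            yield i, "propagate"  # cheap mask propagation (DeAOT in SAM-Track)

if __name__ == "__main__":
    # 7 dummy frames, key frame every 3rd frame:
    plan = list(process_stream(range(7), n=3))
    print(plan)
    # [(0, 'segment'), (1, 'propagate'), (2, 'propagate'),
    #  (3, 'segment'), (4, 'propagate'), (5, 'propagate'), (6, 'segment')]
```

Whether this runs in real time then depends mostly on how often the "segment" branch fires and how fast the propagation step is on the given GPU, which is why the maintainers cannot guarantee it in general.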