An open-source project dedicated to tracking and segmenting any object in videos, either automatically or interactively. The primary algorithms are the Segment Anything Model (SAM) for key-frame segmentation and Associating Objects with Transformers (AOT) for efficient tracking and propagation.
I hope this is the right place for my questions. The framework looks very promising and exciting, but I am not 100% sure whether I can use it for my project. I still have some questions that are important to me:
Is it real-time capable? That is, can it be applied to frames (every Nth frame/image) transmitted from an external camera to my PC via Wi-Fi, assuming the Wi-Fi connection works well and "good" PC hardware is used (my graphics card, for example, is an RTX 4070 Ti)?
Can a model be fine-tuned on my own dataset with several different classes? If so, is there training code, and roughly how long would training take on the card mentioned above, or is that far beyond what it can handle?
SAM-Track uses SAM to obtain object annotations in key frames, but its real-time capability cannot be guaranteed.
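Since real-time behaviour depends on your hardware and on how often frames are processed, one way to find out is to measure it. Below is a minimal sketch, not part of SAM-Track, that runs a processing function on every Nth frame and compares the mean time per processed frame with the time budget implied by the camera frame rate. The names `every_nth`, `measure_throughput`, `process`, and `camera_fps` are hypothetical; `process` stands in for whatever segmentation or tracking call you end up using.

```python
import time
from typing import Any, Callable, Dict, Iterable, Iterator, Tuple


def every_nth(frames: Iterable[Any], n: int) -> Iterator[Tuple[int, Any]]:
    """Yield (index, frame) for every n-th frame, starting with frame 0."""
    if n < 1:
        raise ValueError("n must be >= 1")
    for index, frame in enumerate(frames):
        if index % n == 0:
            yield index, frame


def measure_throughput(
    frames: Iterable[Any],
    n: int,
    process: Callable[[Any], Any],  # hypothetical placeholder for the model call
    camera_fps: float,              # assumed frame rate of the incoming stream
) -> Dict[str, Any]:
    """Run `process` on every n-th frame and report whether it keeps up."""
    # Processing one frame out of n leaves n / camera_fps seconds per frame.
    budget = n / camera_fps
    processed = 0
    elapsed = 0.0
    for _, frame in every_nth(frames, n):
        start = time.perf_counter()
        process(frame)
        elapsed += time.perf_counter() - start
        processed += 1
    mean = elapsed / processed if processed else 0.0
    return {
        "processed": processed,
        "mean_seconds": mean,
        "budget_seconds": budget,
        "keeps_up": mean <= budget,
    }
```

For example, with a 30 FPS stream and `n=3`, the budget is 0.1 seconds per processed frame. If `keeps_up` is `False`, increasing `n` or reducing the frame resolution are the obvious knobs to try. Note that this only times the processing step and ignores Wi-Fi transfer latency.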
Do you mean using your own dataset to train DeAOT (the tracking model used in SAM-Track)? I'm not sure whether that will work. You can find more information about how to train DeAOT there.