cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
https://cvat.ai
MIT License
12.49k stars 3k forks

Replace SAM with SAM 2 #8231

Open RHogan615 opened 3 months ago

RHogan615 commented 3 months ago

Is your feature request related to a problem? Please describe.

While SAM is a good model, it would be nice to have the 'latest and greatest' capabilities.

Describe the solution you'd like

Replace the current SAM model with SAM 2

Describe alternatives you've considered

Replace the SAM implementation with the newest version

Additional context

https://github.com/facebookresearch/segment-anything-2

medphisiker commented 3 months ago

I wanted to ask you to please not replace SAM with SAM 2. Instead, I would like to request that you add another new model, SAM 2, to the existing models =)

RHogan615 commented 3 months ago

> I wanted to ask you to please not replace SAM with SAM 2. Instead, I would like to request that you add another new model, SAM 2, to the existing models =)

Fair point

HanClinto commented 3 months ago

Especially for the tracking capabilities of SAM2, it feels like this would be a powerful addition for marking an object once and tracking it through a whole video.

HanClinto commented 3 months ago

Because SAM 2 has so many new capabilities, it feels like it can function both as an interactor as well as a tracker, and I'm not entirely sure which "bucket" to put it into to really take advantage of its full capabilities within CVAT.

It's almost like we need to extend the serverless functions to include a new class that is an "interactive tracker" to really utilize everything that SAM 2 has to offer.
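One way to picture the "interactive tracker" idea is as a single class that plays both roles: prompts on any frame create or correct a shape, and the shape is carried forward until the next prompt. The following is only a sketch under assumptions. Every class and method name here is hypothetical, none of it is CVAT's or SAM 2's actual API, and the toy implementation contains no model at all.

```python
from abc import ABC, abstractmethod
from typing import Dict, List, Optional, Tuple

Point = Tuple[int, int]
Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max)


class Interactor(ABC):
    """Hypothetical role: turn user clicks on one frame into a shape."""

    @abstractmethod
    def interact(self, frame: int, points: List[Point]) -> Box: ...


class Tracker(ABC):
    """Hypothetical role: carry a shape from the previous frame to this one."""

    @abstractmethod
    def track(self, frame: int, previous: Box) -> Box: ...


class InteractiveTracker(Interactor, Tracker):
    """Hypothetical combined role: prompts may appear on any frame."""

    def propagate(
        self, start: int, end: int, prompts: Dict[int, List[Point]]
    ) -> Dict[int, Box]:
        results: Dict[int, Box] = {}
        current: Optional[Box] = None
        for frame in range(start, end + 1):
            if frame in prompts:
                # A prompt on this frame creates or corrects the shape.
                current = self.interact(frame, prompts[frame])
            elif current is not None:
                # No prompt: propagate the shape from the previous frame.
                current = self.track(frame, current)
            if current is not None:
                results[frame] = current
        return results


class DummyInteractiveTracker(InteractiveTracker):
    """Toy stand-in: boxes the clicked points and never moves the box."""

    def interact(self, frame: int, points: List[Point]) -> Box:
        xs = [p[0] for p in points]
        ys = [p[1] for p in points]
        return (min(xs), min(ys), max(xs), max(ys))

    def track(self, frame: int, previous: Box) -> Box:
        return previous
```

In this sketch, a correction on a later frame simply replaces the propagated shape from that frame onward, which is the behaviour that neither a pure interactor nor a pure tracker offers on its own.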

nmanovic commented 2 months ago

Hi, we have added SAM2 on SaaS (https://app.cvat.ai/) and for Enterprise customers: https://www.cvat.ai/post/meta-segment-anything-model-v2-is-now-available-in-cvat-ai

We are working to add tracking capabilities of SAM2 into CVAT. Thanks for your comments.

Youho99 commented 2 months ago

Another possible and very useful feature with models like SAM and SAM2 would be precision annotation in bounding boxes.

The idea is to draw an imprecise bounding box around the object to be annotated. The bounding box is sent to the SAM or SAM2 model, which segments the main object inside the box it receives. Finally, the precise bounding box is recreated by taking the extreme coordinates of the resulting mask at the top, left, bottom, and right.

This would allow very quick and precise annotating, without having to zoom in on the image (very useful for precise annotation of small objects for example).
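The final step (mask to tight box) can be sketched as follows. This is a minimal illustration, not the script mentioned in this comment: it assumes the model's mask is already available as rows of 0/1 values, the function names are hypothetical, and the call to SAM or SAM2 is left out entirely. The conversion helper assumes COCO's `[x, y, width, height]` box layout and inclusive pixel indices.

```python
from typing import List, Optional, Sequence, Tuple

# (x_min, y_min, x_max, y_max) as inclusive pixel indices
Box = Tuple[int, int, int, int]


def tight_bbox(mask: Sequence[Sequence[int]]) -> Optional[Box]:
    """Return the smallest box enclosing all non-zero pixels, or None if empty."""
    xs: List[int] = []
    ys: List[int] = []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return min(xs), min(ys), max(xs), max(ys)


def to_coco_bbox(box: Box) -> List[int]:
    """Convert inclusive corner coordinates to [x, y, width, height]."""
    x_min, y_min, x_max, y_max = box
    return [x_min, y_min, x_max - x_min + 1, y_max - y_min + 1]


# Hypothetical 5x6 mask standing in for the model's output inside a loose box.
example_mask = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
refined = tight_bbox(example_mask)
```

Returning `None` for an empty mask lets the caller keep the original loose box when the model finds nothing to segment.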

In my free time, I wrote a Python script that uses this logic with SAM for precision annotation. It takes an annotation JSON as input (COCO format, I think) and outputs a JSON in the same format with the precise bounding boxes recalculated.

I could make it available to you if necessary.

grzleadams commented 2 months ago

> Hi, we have added SAM2 on SaaS (https://app.cvat.ai/) and for Enterprise customers: https://www.cvat.ai/post/meta-segment-anything-model-v2-is-now-available-in-cvat-ai
>
> We are working to add tracking capabilities of SAM2 into CVAT. Thanks for your comments.

Is there any guidance yet on how to deploy SAM2 as a model for self-hosted CVAT? I don't see a nuclio config for it.

Youho99 commented 2 months ago

> > Hi, we have added SAM2 on SaaS (https://app.cvat.ai/) and for Enterprise customers: https://www.cvat.ai/post/meta-segment-anything-model-v2-is-now-available-in-cvat-ai We are working to add tracking capabilities of SAM2 into CVAT. Thanks for your comments.
>
> Is there any guidance yet on how to deploy SAM2 as a model for self-hosted CVAT? I don't see a nuclio config for it.

+1

alex-bronze-vision commented 2 months ago

> > Hi, we have added SAM2 on SaaS (https://app.cvat.ai/) and for Enterprise customers: https://www.cvat.ai/post/meta-segment-anything-model-v2-is-now-available-in-cvat-ai We are working to add tracking capabilities of SAM2 into CVAT. Thanks for your comments.
>
> Is there any guidance yet on how to deploy SAM2 as a model for self-hosted CVAT? I don't see a nuclio config for it.

+1 too

amrosado commented 1 month ago

Are there plans to bring this feature to non-enterprise users? I couldn't easily find a branch where this feature is being worked on, but I can see that support for bounding box input was added in the develop branch. This feature would be great for a project I'm working on.

earzamastsev commented 1 month ago

Are there approximate plans for when SAM2 tracking will be implemented (next month, before the end of the year...)?

nmanovic commented 1 month ago

@earzamastsev, we want to add SAM2 tracking in the next quarter. It will be an Enterprise feature as well, so SaaS users and our Enterprise customers will be able to use it for their projects.

@amrosado, we don't have such plans. Please use SAM instead; its quality is really great. The primary benefits of SAM2 are its video tracking capabilities and inference speed. Again, if you want to support development of the open-source project, just buy a subscription and use all our paid features. For now, there are no other hidden or extra costs. Such a simple action allows us to fix bugs and add new features. Otherwise, the community can contribute their source code. We will keep it inside one of our repos or point to it in our documentation.

ryanalexmartin commented 1 month ago

> just buy a subscription and use all our paid features.

This comes across as quite out-of-touch, and rather rude considering CVAT has over 252 contributors, and according to your website you have 10 employees.

> Otherwise, the community can contribute their source code.

Somebody contributed a SAM2 interactor that you guys still refuse to merge in...

nmanovic commented 1 month ago

@ryanalexmartin, thank you for your comments and for submitting an issue several years ago. We value every community member and contributor. All contributions from our community are currently open-source, and we have supported these contributions for years. You can always use the contributed SAM2. It is a separate directory with files for the Nuclio serverless framework. Could you clarify the problem? The PR hasn't been merged yet because it still has issues with the linter. (@azhavoro, could you please ping the author of the PR and ask them to fix the remaining issues?)

To keep CVAT alive and valuable for the open-source community, we develop it according to open-core principles. Most of the functionality remains open-source, but key features that help automate data annotation, quality control, and team management will be provided to customers who support the project. Any organization can pay us once to implement a feature for the open-source community: https://www.cvat.ai/post/request-features. This approach has helped introduce several important features like skeleton support, quality control, and webhooks.

P.S. Feel free to contribute any feature to CVAT to make it available for the open-source community. However, please keep in mind that you will need to support it over the years as well.

jeanchristopheruel commented 5 days ago

Lol, let's start an official CVAT2 fork that integrates SAM2 tracking + full frontend encoding + decoding, for better control over PRs competing with commercial features. Community > Commercial.