cvat-ai / cvat

Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
https://cvat.ai
MIT License

How should I modify the code if I replace SAM with SAM-hq? #6499

Open Whoo-jl opened 1 year ago

Whoo-jl commented 1 year ago

SAM-HQ keeps SAM's original promptable design, efficiency, and zero-shot generalizability, but its image encoder produces two feature outputs instead of one. If I want to replace SAM with SAM-HQ, what modifications do I need to make?

I added a constant named "interm_embeddings" in the file "/cvat/cvat-ui/plugins/sam_plugin/src/ts/index.tsx", but after compiling and restarting CVAT, the page shows the error message "Error: invalid input 'interm_embeddings'".
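For context, the wording of this error appears to match what ONNX Runtime Web reports when a name passed in the feeds is not among the inputs declared by the loaded ONNX model. That would mean the decoder file in use is still the stock SAM export, which has no `interm_embeddings` input. The sketch below is a minimal illustration of that check, not CVAT code: `checkFeeds` is a hypothetical helper, the input names are assumed to be those of the standard SAM decoder export, and `interm_embeddings` is assumed to be the extra input a SAM-HQ decoder export would declare.

```typescript
// Hypothetical helper, not part of CVAT or onnxruntime-web.
// `declaredInputs` stands in for the input names the loaded ONNX decoder
// declares (onnxruntime-web exposes them as `session.inputNames`).
interface FeedCheck {
    undeclared: string[]; // supplied by the caller, but unknown to the model
    missing: string[]; // declared by the model, but not supplied
}

function checkFeeds(
    declaredInputs: readonly string[],
    feeds: Record<string, unknown>,
): FeedCheck {
    const supplied = Object.keys(feeds);
    return {
        undeclared: supplied.filter((name) => !declaredInputs.includes(name)),
        missing: declaredInputs.filter((name) => !supplied.includes(name)),
    };
}

// Assumption: input names of the stock SAM ONNX decoder export.
const samDecoderInputs: readonly string[] = [
    'image_embeddings',
    'point_coords',
    'point_labels',
    'mask_input',
    'has_mask_input',
    'orig_im_size',
];

// Feeds as a modified plugin might build them (values are placeholders
// here; in the real plugin they would be tensors).
const feeds: Record<string, unknown> = {
    image_embeddings: null,
    interm_embeddings: null, // extra SAM-HQ feature, unknown to stock SAM
    point_coords: null,
    point_labels: null,
    mask_input: null,
    has_mask_input: null,
    orig_im_size: null,
};

const result = checkFeeds(samDecoderInputs, feeds);
console.log('undeclared inputs:', result.undeclared);
console.log('missing inputs:', result.missing);
```

If a check like this reports `interm_embeddings` as undeclared, changing the UI plugin alone will not be enough. The decoder would need to be re-exported from SAM-HQ so that it declares the extra input, and the server-side function would need to return both encoder outputs.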

bsekachev commented 1 year ago

Hi @Whoo-jl

A lot of changes need to be made to integrate a new model. I will mark the issue as an enhancement; the model looks interesting.

Stijnp commented 7 months ago

Any updates on this? Recent SAM implementations such as EfficientSAM bring significant performance improvements, reducing the hardware requirements for running models for pre-labelling.

realtimshady1 commented 2 months ago

I'm also interested in this enhancement. I would be happy to implement it myself, but I'm unsure what changes are needed.

Any advice would be greatly appreciated.

realtimshady1 commented 1 month ago

I've implemented SAM-HQ in my own fork. It works, but in my opinion it's currently not much better than regular SAM. I have yet to quantize the model, which will give a boost in speed. Any suggestions for how it can be improved would be appreciated.

https://github.com/realtimshady1/cvat/tree/sam-hq