facebookresearch / segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Apache License 2.0
45.72k stars 5.41k forks

Let's add this to an image labeling tool like LabelImg! #83

Open EdjeElectronics opened 1 year ago

EdjeElectronics commented 1 year ago

Nice work, SAM is really cool! The first use case that pops into my mind is image labeling. Click on the object of interest, and you immediately get a bounding box or segmentation mask for the object. How can we add this to an open-source labeling tool like LabelImg? https://github.com/heartexlabs/labelImg

krishnaadithya commented 1 year ago

Should be easy, but the problem right now is that the model needs a GPU with at least 16 GB of VRAM to run inference.

EmrahErden commented 1 year ago

hahaha, using a segmentation model to create training data for another segmentation model (such as yolov8): segmentception!

actually it is a clever idea but the issue here is, errors in the output of the first segmentation model will be carried forward to the second model

alexeysi commented 1 year ago

Should be easy, but the problem right now is that the model needs a GPU with at least 16 GB of VRAM to run inference.

That requirement is not a big problem. Manual labeling does not put a heavy load on a server. The best approach is to build an LXC/Docker image with a REST/gRPC service that segments posted images. After that, it is no problem to use it in LabelImg, Supervisely, or Roboform.
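To make the service idea concrete, here is a minimal sketch of such an endpoint using only the Python standard library. It is plain HTTP rather than gRPC, and everything in it is an assumption for illustration: the `/segment` path, the `x`/`y` query parameters, and the `segment_image` stub, which returns a fixed-size box instead of calling SAM.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer
from urllib.parse import parse_qs, urlparse


def segment_image(image_bytes, point):
    """Hypothetical stand-in for the model call.

    A real server would decode the image and run SAM here. This stub
    returns a 20x20 box around the clicked point so the sketch runs
    without a GPU or a checkpoint.
    """
    x, y = point
    return {"bbox": [x - 10, y - 10, 20, 20], "num_bytes": len(image_bytes)}


class SegmentHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        url = urlparse(self.path)
        if url.path != "/segment":
            self.send_error(404, "unknown endpoint")
            return
        query = parse_qs(url.query)
        try:
            point = (int(query["x"][0]), int(query["y"][0]))
        except (KeyError, ValueError):
            self.send_error(400, "x and y query parameters are required")
            return
        # The request body is the raw image file.
        length = int(self.headers.get("Content-Length", 0))
        image_bytes = self.rfile.read(length)
        payload = json.dumps(segment_image(image_bytes, point)).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet


def start_server(host="127.0.0.1", port=0):
    """Start the service in a background thread; port 0 picks a free port."""
    server = ThreadingHTTPServer((host, port), SegmentHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def request_segmentation(host, port, image_bytes, point):
    """Client side: what a labeling tool would call after a mouse click."""
    conn = http.client.HTTPConnection(host, port, timeout=10)
    try:
        conn.request("POST", f"/segment?x={point[0]}&y={point[1]}", body=image_bytes)
        return json.loads(conn.getresponse().read())
    finally:
        conn.close()
```

With something like this, the labeling tool only needs the small client function; the heavy model stays on whichever machine has the GPU.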

yogendra-yatnalkar commented 1 year ago

Should be easy, but the problem right now is that the model needs a GPU with at least 16 GB of VRAM to run inference.

@krishnaadithya One way I was thinking of dealing with this is to generate the image embeddings beforehand for all the input images, on Colab or some cloud provider, and store them in some format, e.g. .npy. Then, on our local setup, even with a small laptop GPU, we will be able to tag the images properly.
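A rough sketch of that precompute step, assuming NumPy is available. `cache_embeddings` and `make_sam_embed_fn` are hypothetical helper names; the SAM calls inside `make_sam_embed_fn` follow the usage shown in the repository's example notebooks, and they need `segment_anything`, PyTorch, OpenCV, and a downloaded checkpoint.

```python
from pathlib import Path

import numpy as np


def cache_embeddings(image_paths, embed_fn, out_dir):
    """Compute one embedding per image and save it as <stem>.npy in out_dir."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    saved = []
    for image_path in map(Path, image_paths):
        target = out_dir / f"{image_path.stem}.npy"
        if not target.exists():  # skip images that were already processed
            np.save(target, embed_fn(image_path))
        saved.append(target)
    return saved


def make_sam_embed_fn(checkpoint, model_type="vit_h", device="cuda"):
    """Build embed_fn from SAM. Run this part on the machine with the big GPU."""
    import cv2
    from segment_anything import SamPredictor, sam_model_registry

    sam = sam_model_registry[model_type](checkpoint=checkpoint)
    sam.to(device=device)
    predictor = SamPredictor(sam)

    def embed(image_path):
        # SAM expects an RGB image; OpenCV loads BGR.
        image = cv2.cvtColor(cv2.imread(str(image_path)), cv2.COLOR_BGR2RGB)
        predictor.set_image(image)
        return predictor.get_image_embedding().cpu().numpy()

    return embed
```

On the laptop, each file can then be read back with `np.load` and handed to whatever runs the lightweight prompt decoder, so the expensive image encoder never has to run locally.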

yogendra-yatnalkar commented 1 year ago

hahaha, using a segmentation model to create training data for another segmentation model (such as yolov8): segmentception!

actually it is a clever idea but the issue here is, errors in the output of the first segmentation model will be carried forward to the second model

@EdjeElectronics Yeah, it won't be perfect, but it will be very useful: the initial data will already be segmented, and our labelers'/taggers' job will be to refine the segmented regions properly instead of segmenting from scratch.

EmrahErden commented 1 year ago

@yogendra-yatnalkar that is a really good idea, if coupled with a nice GUI (with eraser and marker options to fix the borders or combine split regions, coupled with practical labeling). But SAM's inference speed is a bit slow, which may be a problem when dealing with a large number of images.

anuragxel commented 1 year ago

This might be useful; it basically does what's being discussed here. I wrote a barebones GUI over SAM's decoder to label objects quickly and save them in COCO format. It requires you to extract embeddings for your images first, and I have added a small helper script for that; it can most likely become part of the GUI. Any contributions are appreciated.

https://github.com/anuragxel/salt
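For readers unfamiliar with COCO format, here is a small sketch of how a binary mask could be turned into a COCO-style annotation. It is not taken from the tool linked above. `mask_to_coco_annotation` is a hypothetical helper, and it stores the mask as uncompressed RLE, which is only one of the encodings COCO allows (some tools expect polygons for `iscrowd=0` objects).

```python
def mask_to_coco_annotation(mask, annotation_id, image_id, category_id):
    """Convert a binary mask (rows of 0/1) into a COCO-style annotation dict."""
    height, width = len(mask), len(mask[0])

    # Uncompressed RLE: run lengths over the mask read column by column,
    # always starting with the number of leading zeros (which may be 0).
    counts, previous, run = [], 0, 0
    xs, ys = [], []
    for x in range(width):
        for y in range(height):
            value = 1 if mask[y][x] else 0
            if value:
                xs.append(x)
                ys.append(y)
            if value != previous:
                counts.append(run)
                previous, run = value, 0
            run += 1
    counts.append(run)

    # COCO boxes are [x, y, width, height] in pixels.
    if xs:
        bbox = [min(xs), min(ys), max(xs) - min(xs) + 1, max(ys) - min(ys) + 1]
    else:
        bbox = [0, 0, 0, 0]

    return {
        "id": annotation_id,
        "image_id": image_id,
        "category_id": category_id,
        "bbox": bbox,
        "area": len(xs),  # number of foreground pixels
        "iscrowd": 0,
        "segmentation": {"size": [height, width], "counts": counts},
    }
```

The function indexes the mask as `mask[y][x]`, so it accepts nested lists as well as a 2D NumPy array such as a single mask returned by SAM.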

amartya-mandal-joulea commented 1 year ago

hahaha, using a segmentation model to create training data for another segmentation model (such as yolov8): segmentception!

actually it is a clever idea but the issue here is, errors in the output of the first segmentation model will be carried forward to the second model

Since SAM doesn't churn out labels, I feel it's necessary.

vietanhdev commented 1 year ago

Check out my work integrating Segment Anything into AnyLabeling, a fork of LabelMe. Video: https://www.youtube.com/watch?v=5iQSGL7ebXE Note: This is a work in progress. Your comments are welcome.

captainfffsama commented 1 year ago

I have created a simple gRPC tool for SAM that lets users create SAM models on a server, while the local client only needs to receive the server's results. This means that when improving tools like LabelImg in the future, we can create models on powerful servers, and the local computer can obtain results in real time over the network.

The simple gRPC tool for SAM: https://github.com/captainfffsama/sam_grpc

fedesemeraro commented 1 year ago

I also created a matplotlib GUI to interactively create and export masks, which can be run either in Colab or locally: https://github.com/fsemerar/segment-anything-gui

fedesemeraro commented 1 year ago

You can now use SAM within 3D Slicer using the TomoSAM extension. We also added a video tutorial to help you get started quickly.
