facebookresearch / segment-anything

The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.

Apache License 2.0

interesting video object segmentation (and tracking) result #152

Open: Pilot-LH opened 1 year ago

Pilot-LH commented 1 year ago

[Demo video: segmentation and tracking result on the "horsejump-stick" sequence]

AbdelazizHamadi commented 1 year ago

Hello, those are really awesome results. Did you run it with CUDA for all the frames? If so, was it in your own environment? I'm also trying to do the same thing with a video, but the CUDA problem really confuses me. I explained everything in #153; any help would be appreciated!

kadirnar commented 1 year ago

Have you tried this repo? https://github.com/kadirnar/segment-anything-video

AbdelazizHamadi commented 1 year ago

I just tried the repo you provided, and it gives the exact same problem with the GPU (no output and no error message).

kadirnar commented 1 year ago

Can you share the image file?

Pilot-LH commented 1 year ago

I assume you're asking for the images for the demo above. It's a sequence called "horsejump-stick" from:
https://data.vision.ee.ethz.ch/csergi/share/davis/DAVIS-2017-test-dev-Full-Resolution.zip

There are many other sequences in the DAVIS dataset:
https://davischallenge.org/index.html

There are also many other datasets for video object segmentation and tracking, such as YouTube-VOS.

bhack commented 1 year ago

You can find other datasets at:
https://youtube-vos.org/
https://henghuiding.github.io/MOSE/
https://github.com/Ali2500/BURST-benchmark

DavidTu21 commented 1 year ago

Hi, that is a really interesting result, and thank you for sharing! May I ask how you got rid of the background (and all other occlusions) and kept only the people and the horse? I tried to run SAM on a video, but it seems that every object in every frame ends up in a different class, so I could not track the class I am looking for across the video.

Pilot-LH commented 1 year ago

1. Initialization: start from the first frame with a mask of the foreground (in this case, the person and the horse).
2. Sample points in the foreground and use them as prompts to generate masks of different parts.
3. Given a new frame, get new points by propagating the previous points.
4. Use the new points as prompts to generate masks of different parts in the new frame.
5. Keep track of the masks of different parts across frames by feature matching.
6. Repeat steps 3-5 until the end of the sequence.
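
For readers who want to try something similar, the loop described above could be sketched roughly as follows. This is only an illustration under several assumptions, not the code behind the demo: `segment` and `propagate` are hypothetical callables that you would supply yourself, masks are represented as plain sets of (x, y) pixels to keep the sketch dependency-free, and the feature matching step is replaced by a simple greedy IoU match, which is a stand-in rather than the method described in the comment.

```python
import random
from typing import Callable, Dict, List, Sequence, Set, Tuple

Point = Tuple[int, int]   # (x, y) pixel coordinate
Mask = Set[Point]         # a mask as the set of pixels it covers


def sample_points(foreground: Mask, n: int, seed: int = 0) -> List[Point]:
    """Sample up to n prompt points from inside the foreground mask."""
    rng = random.Random(seed)
    pixels = sorted(foreground)
    return rng.sample(pixels, min(n, len(pixels)))


def iou(a: Mask, b: Mask) -> float:
    """Intersection over union of two pixel sets."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def match_masks(prev: Dict[int, Mask], new: List[Mask],
                min_iou: float = 0.3) -> Dict[int, Mask]:
    """Stand-in for feature matching: greedily pair masks by IoU.

    Matched masks keep the id of the previous mask; unmatched new masks
    get fresh ids starting above the largest previous id.
    """
    pairs = sorted(
        ((iou(pm, nm), pid, j)
         for pid, pm in prev.items()
         for j, nm in enumerate(new)),
        reverse=True,
    )
    matched: Dict[int, Mask] = {}
    used_new: Set[int] = set()
    for score, pid, j in pairs:
        if score < min_iou:
            break  # remaining pairs overlap even less
        if pid in matched or j in used_new:
            continue
        matched[pid] = new[j]
        used_new.add(j)
    next_id = max(prev, default=-1) + 1
    for j, nm in enumerate(new):
        if j not in used_new:
            matched[next_id] = nm
            next_id += 1
    return matched


def track(frames: Sequence[object],
          first_mask: Mask,
          segment: Callable[[object, List[Point]], List[Mask]],
          propagate: Callable[[object, object, List[Point]], List[Point]],
          n_points: int = 8) -> List[Dict[int, Mask]]:
    """Run the loop over all frames; returns {part_id: mask} per frame."""
    points = sample_points(first_mask, n_points)          # sample prompts
    tracks = dict(enumerate(segment(frames[0], points)))  # part masks, frame 0
    history = [tracks]
    for prev_frame, frame in zip(frames, frames[1:]):
        points = propagate(prev_frame, frame, points)     # move the points
        new_masks = segment(frame, points)                # re-prompt
        tracks = match_masks(tracks, new_masks)           # keep ids stable
        history.append(tracks)
    return history
```

In a real setup, `segment` might wrap `SamPredictor.set_image` and `SamPredictor.predict` with point prompts, and `propagate` might use something like sparse optical flow, but the comment above does not say which methods were actually used, so treat both as open choices.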

bhack commented 1 year ago

Take a look at https://github.com/z-x-yang/Segment-and-Track-Anything