hkchengrex / Tracking-Anything-with-DEVA

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation
https://hkchengrex.com/Tracking-Anything-with-DEVA/

Can I use it for ultrasound video object segmentation? #38

Closed by JawadTawhidi 10 months ago

JawadTawhidi commented 11 months ago

Hi, thank you so much for your amazing work. I wanted to ask whether it is advisable to use your approach for unsupervised ultrasound video object segmentation. If so, which part would be suitable for the task? Is it the part of DEVA for unsupervised video object segmentation? I could not find more details about the implementation of that part in the paper. For unsupervised video object segmentation, I would like to know which image segmentation model is used and which model is used to propagate the temporal information.

My ultrasound dataset is for lymphoma cancer, and it consists of some annotated videos and some annotated images.

Looking forward to hearing from you.

hkchengrex commented 11 months ago

The details are listed in the appendix. The instructions here https://github.com/hkchengrex/Tracking-Anything-with-DEVA/blob/main/docs/EVALUATION.md#open-worldlarge-vocabularyunsupervised-video-object-segmentation and https://github.com/hkchengrex/Tracking-Anything-with-DEVA/blob/main/docs/EVALUATION.md#unsupervised-salient-video-object-segmentation might help. You can also try the demo https://github.com/hkchengrex/Tracking-Anything-with-DEVA/blob/main/docs/DEMO.md.

Unfortunately, I do not have the domain knowledge to tell which approach is the "best".

JawadTawhidi commented 11 months ago

In my dataset, some videos contain a single object and others contain more than one object. If I use your pretrained temporal model for temporal propagation and, for the image segmentation model, follow the approach you used for DAVIS 2017 (multi-object unsupervised video object segmentation), can I compare the results with other models? I mean, can I say these are DEVA's results on our dataset?

hkchengrex commented 11 months ago

If a reasonable image detection model is used, I don't see why not.

JawadTawhidi commented 11 months ago

Sorry for disturbing you so much. My main question is this: if I do not choose any other image detection model and use the same image detection and temporal propagation models that you used for DAVIS 2017, can I say this is the result of DEVA on my dataset? Or do I have to find a specific image detection model for my data and use your temporal propagation model for the comparison?

hkchengrex commented 11 months ago

No matter which image detector you use, as long as you specify it, I don't see why you couldn't state it that way. I cannot comment on whether it poses a fair comparison or presents meaningful results; I think you would be a much better judge of that.

JawadTawhidi commented 11 months ago

Ok. Thank you so much for your patience. I appreciate it.

However, I would like your advice on one more thing. As I said before, in my dataset some videos contain a single object and others contain more than one object. In this case, is the approach you used for DAVIS 2016 advised, or the one you used for DAVIS 2017?

hkchengrex commented 11 months ago

The DAVIS 16 approach only works for a single (salient) object and would not work for videos with multiple objects.
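
For a dataset that mixes single-object and multi-object videos, one way to act on this is to check each annotated video before choosing a setting. The following is a minimal sketch, not part of DEVA: the helper names `object_ids` and `suggest_setting` are hypothetical, and it assumes each annotated frame is a 2D grid of integer object IDs with 0 as background.

```python
from typing import Iterable, Sequence, Set

# Assumption: one annotated frame is a 2D grid of integer object IDs, 0 = background.
Mask = Sequence[Sequence[int]]


def object_ids(frames: Iterable[Mask]) -> Set[int]:
    """Collect every distinct non-background object ID seen in a video."""
    ids: Set[int] = set()
    for mask in frames:
        for row in mask:
            ids.update(pixel for pixel in row if pixel != 0)
    return ids


def suggest_setting(frames: Iterable[Mask]) -> str:
    """Hypothetical helper: pick a setting label for one video.

    More than one object ID across the video rules out the
    single salient object setting.
    """
    return "multi-object" if len(object_ids(frames)) > 1 else "single-salient-object"


if __name__ == "__main__":
    single_video = [[[0, 1], [1, 0]], [[0, 0], [1, 1]]]
    multi_video = [[[0, 1], [2, 0]]]
    print(suggest_setting(single_video))
    print(suggest_setting(multi_video))
```

If any video in the dataset is flagged as multi-object, using the multi-object setting for the whole dataset keeps the comparison consistent across videos.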