hkchengrex / Tracking-Anything-with-DEVA

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation
https://hkchengrex.com/Tracking-Anything-with-DEVA/

Unexpected segmentation output #46

Closed JawadTawhidi closed 11 months ago

JawadTawhidi commented 11 months ago

Hi, I generated some detections using my own model and wanted to use DEVA's temporal propagation approach to propagate temporal information. After running DEVA's temporal propagation model, however, the output is totally different and the mean of J and F is 8. I am not sure whether DEVA accepts any kind of detection. This is the segmentation before applying DEVA:

[Image: BeforeDeva]

And this is the segmentation mask after applying DEVA:

[Image: afterdeva]

JawadTawhidi commented 11 months ago

Please, any solution?

hkchengrex commented 11 months ago

There is not enough information for me to figure out what is going on here. Can you start from the provided examples and see if they work? If they do, what are the differences between your output and the examples? Figuring out and correcting those differences will likely solve your problem.

JawadTawhidi commented 11 months ago

I used your precomputed detections for DAVIS 2016 and DAVIS 2017. Both worked, and I even got higher results compared to those in your paper (which is likely due to differences between the devices).

But when I tried my own detections, the output comes out like that. One thing I should mention: for the segmentation masks I generate, I can only use https://github.com/davisvideochallenge/davis-2017 to calculate the J and F values. But when I try to use your final results with https://github.com/davisvideochallenge/davis-2017 to calculate J and F, it does not work on your results. Maybe there are some differences between the formats of the segmentation masks, but I don't know how to resolve them. Should I make changes in the decoder of my model?

hkchengrex commented 11 months ago

What are the differences between your output and the examples? Figuring out and correcting those differences will likely solve your problem.

JawadTawhidi commented 11 months ago

Right now, the difference I can see is that my detections are colored and yours are not.

Would you please give me some clues about which kinds of differences I should check? Does the colored format also matter?

hkchengrex commented 11 months ago

The coloring only partially reflects the underlying data structure and representation. You can inspect what the dataloader sees by opening the image with PIL.
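One way to do that inspection is sketched below. It is only a minimal sketch using PIL and NumPy, not code from the DEVA repository: the helper name `describe_mask` is hypothetical, and it makes no assumption about which format the DEVA dataloader expects. It only reports how a mask file is stored, so that one of the provided example detections can be compared against one of your own.

```python
import numpy as np
from PIL import Image


def describe_mask(path):
    """Summarize how a mask image is stored (hypothetical helper, not part of DEVA)."""
    img = Image.open(path)
    # For a palette image (mode "P") this array holds palette indices, not RGB colors.
    arr = np.array(img)
    summary = {
        "mode": img.mode,        # e.g. "P", "L", "RGB", "RGBA"
        "shape": arr.shape,      # (H, W) for single-channel, (H, W, C) for color
        "dtype": str(arr.dtype),
    }
    if arr.ndim == 2:
        # Single-channel data: list the distinct raw values.
        summary["unique_values"] = np.unique(arr).tolist()
    else:
        # Multi-channel data: count the distinct colors instead.
        colors = np.unique(arr.reshape(-1, arr.shape[-1]), axis=0)
        summary["num_unique_colors"] = int(len(colors))
    return summary
```

Calling `describe_mask` on an example detection and on your own detection, then comparing the two dictionaries, shows differences that an image viewer hides. A palette PNG and an RGB PNG can look identical on screen but load as arrays with different shapes and values.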

hkchengrex commented 11 months ago

Closed due to inactivity.