Hon-Wong / Elysium

[ECCV 2024] Elysium: Exploring Object-level Perception in Videos via MLLM
https://hon-wong.github.io/Elysium/

Raw Results of LaSOT and UAV123 #13

Closed yangchris11 closed 1 month ago

yangchris11 commented 1 month ago

Thank you for the awesome work and dataset!

I was wondering if you could also provide the raw zero-shot tracking results for LaSOT and UAV123?

Thank you very much!

Hon-Wong commented 1 month ago

Sure, I'll share the raw results in 1 or 2 days. They will be provided here.

BW, Han

yangchris11 commented 1 month ago

Thank you very much! I'll take a look at that!

Closing the issue :+1:

yangchris11 commented 1 month ago

FYI, I found that a couple of brackets are missing in the JSON result files.

[Screenshots attached: 2024-10-24 at 14:53:53 and 2024-10-24 at 14:55:56]

Hon-Wong commented 1 month ago

Yes, the MLLM's output occasionally collapses. For a strict evaluation, we simply pad these cases with [0,0,0,0]. I guess trying a different prompt might solve the problem.
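
For anyone post-processing the raw results, the padding rule above could be sketched roughly as follows. This is only an illustration, not the evaluation code used for the paper: it assumes each raw prediction is a string that should contain one box written as four comma-separated numbers in square brackets, and the names `parse_box`, `parse_track`, and `PAD_BOX` are hypothetical.

```python
import re

# Assumption: a well-formed prediction contains one box written as four
# comma-separated numbers inside square brackets, e.g. "[12,34,56,78]".
# The actual layout of the released JSON files may differ.
BOX_PATTERN = re.compile(
    r"\[\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*,"
    r"\s*(-?\d+(?:\.\d+)?)\s*,\s*(-?\d+(?:\.\d+)?)\s*\]"
)

# Placeholder box used when the model output is malformed.
PAD_BOX = [0.0, 0.0, 0.0, 0.0]


def parse_box(raw):
    """Return four floats from a raw prediction, or the pad box if it is malformed."""
    match = BOX_PATTERN.search(raw) if isinstance(raw, str) else None
    if match is None:
        # Missing bracket, too few numbers, or a non-string entry: pad.
        return list(PAD_BOX)
    return [float(value) for value in match.groups()]


def parse_track(raw_outputs):
    """Parse one box per frame, padding every frame whose output collapsed."""
    return [parse_box(raw) for raw in raw_outputs]
```

With this sketch, `parse_track(["[10, 20, 30, 40]", "[10, 20, 30"])` keeps the first box and pads the second, since the second string has no closing bracket. Padding rather than skipping keeps one box per frame, so a collapsed output counts as a failed frame in the evaluation.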