antoyang / TubeDETR

[CVPR 2022 Oral] TubeDETR: Spatio-Temporal Video Grounding with Transformers
Apache License 2.0

Performance Replication #26

Open GX77 opened 3 months ago

GX77 commented 3 months ago

Excellent pioneering work! I attempted to replicate the results on HCSTVG-v1. Since the paper did not mention the global batch size on HCSTVG, I trained with both 16 and 8 GPUs, but the highest m_vIoU I achieved was 31.6 (compared to 32.4 in the paper).