zjr2000 / GVL

Official implementation for the paper "Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos"
https://arxiv.org/abs/2303.06378
MIT License

Results on ActivityNet with ground-truth proposals #8

Open Jack2Lu opened 2 months ago

Jack2Lu commented 2 months ago

Hello, thank you for sharing this novel work. I have read your paper and have a question: have you run the model on ActivityNet with ground-truth proposals? If so, are the results better than those of PDVC? I would be grateful if you could reply.