EasonXiao-888 / UVCOM

[CVPR 2024] Bridging the Gap: A Unified Video Comprehension Framework for Moment Retrieval and Highlight Detection
MIT License
66 stars, 4 forks

How do I get the input query when running inference for highlight detection? #5

Open ireneMsm2020 opened 2 months ago

ireneMsm2020 commented 2 months ago

Great work! I have a question, though. When I have a video and want to extract its highlight clip, what text do I need to input? For example, Figure 2 uses the query "Man and women are dancing together". How do I obtain a query like that?

EasonXiao-888 commented 2 months ago

@ireneMsm2020 Sorry for the late reply. The query is written by the user. The example query "Man and women are dancing together" is taken from the QVHighlights dataset.

ohhyonghee commented 3 weeks ago

@EasonXiao-888 My understanding was that Highlight Detection "does not need" a query (whereas Moment Retrieval does). So what query should I input to detect highlights?