RupertLuo / Valley

The official repository of "Video assistant towards large language model makes everything easy"

VATEX data process: Do we need to split VATEX video into clips? #20

Open fightingaaa opened 10 months ago

fightingaaa commented 10 months ago

Hi~ Thanks for sharing your great work. I see annotations like the one below:

{'id': 'VATEX_zkbnKBewRLA_000069_000079', 'v_id': 'zkbnKBewRLA_000069_000079', 'video': 'v_zkbnKBewRLA.mp4', 'source': 'VATEX', 'conversations': [{'from': 'human', 'value': '

Do we need to split the VATEX videos into clips? For example, cut the video v_zkbnKBewRLA.mp4 into zkbnKBewRLA_000069_000079?

yeliudev commented 6 months ago

Same question.

RupertLuo commented 6 months ago

yes
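
For reference, a minimal sketch of such a split. It assumes the `v_id` field follows the `<YouTube ID>_<start>_<end>` pattern with times in seconds, that the full video is already downloaded under the name given in the `video` field, and that `ffmpeg` is installed. The helper names, the directory arguments, and the output file name (`<v_id>.mp4`) are hypothetical; check what the Valley data loader actually expects before relying on them.

```python
import subprocess
from pathlib import Path


def parse_clip_id(v_id: str) -> tuple[str, int, int]:
    """Split '<youtube_id>_<start>_<end>' into its parts (times in seconds)."""
    # Split from the right: YouTube IDs may themselves contain '_' or '-'.
    youtube_id, start, end = v_id.rsplit("_", 2)
    return youtube_id, int(start), int(end)


def build_ffmpeg_cmd(v_id: str, video: str, src_dir: str, dst_dir: str) -> list[str]:
    """Build an ffmpeg command that cuts the [start, end] span out of the full video."""
    _, start, end = parse_clip_id(v_id)
    src = Path(src_dir) / video
    dst = Path(dst_dir) / f"{v_id}.mp4"  # hypothetical output naming
    return [
        "ffmpeg", "-y",
        "-ss", str(start),        # seek to the clip start (seconds)
        "-i", str(src),
        "-t", str(end - start),   # clip duration (seconds)
        "-c:v", "libx264",        # re-encode so the cut is frame-accurate
        "-c:a", "aac",
        str(dst),
    ]


def cut_clip(v_id: str, video: str, src_dir: str, dst_dir: str) -> None:
    """Run ffmpeg for one annotation entry; raises if ffmpeg fails."""
    Path(dst_dir).mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_cmd(v_id, video, src_dir, dst_dir), check=True)
```

With the annotation quoted above, `cut_clip("zkbnKBewRLA_000069_000079", "v_zkbnKBewRLA.mp4", "raw", "clips")` would cut seconds 69 to 79 out of `raw/v_zkbnKBewRLA.mp4`.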


yeliudev commented 6 months ago

Many thanks for your reply! I also noticed that the VATEX annotations in Valley_instruct_73k.json are repeated twice (but the jukin annotations are not). I was wondering why this setting is used?
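
A quick way to check for this kind of repetition, as a sketch: it assumes the annotation file is a JSON list of dictionaries that each carry a `source` field, as in the entry quoted earlier in this thread. The function name is hypothetical, and records are compared in full rather than by `id`, since it is not confirmed that `id` is unique per sample.

```python
import json
from collections import Counter


def duplicate_counts_by_source(records):
    """Count, per 'source', how many records exactly repeat an earlier record."""
    seen = set()
    repeats = Counter()
    for rec in records:
        # Serialize with sorted keys so key order does not affect the comparison.
        key = json.dumps(rec, sort_keys=True)
        if key in seen:
            repeats[rec.get("source", "unknown")] += 1
        else:
            seen.add(key)
    return dict(repeats)
```

Usage would look like `duplicate_counts_by_source(json.load(open("Valley_instruct_73k.json")))`; a source that appears in the result has that many exact repeats.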

RupertLuo commented 6 months ago

This is a bug from when I first processed the data. The actual number of valley_instruct samples is less than 73k. I will put a notice in the README and fix this bug ASAP. You can mix the three files below to get the right data.

image
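
Mixing several annotation files can be sketched as below, assuming each file holds a JSON list of annotation dictionaries. The three file names are shown only in the screenshot, so the function takes the paths as an argument; `merge_annotation_files` is a hypothetical helper, not part of the Valley repository.

```python
import json
from pathlib import Path


def merge_annotation_files(paths):
    """Concatenate several JSON annotation lists into one list, in the given order."""
    merged = []
    for path in paths:
        with Path(path).open(encoding="utf-8") as f:
            data = json.load(f)
        if not isinstance(data, list):
            raise ValueError(f"{path} does not contain a JSON list")
        merged.extend(data)
    return merged
```

The merged list can then be written back with `json.dump` and used in place of the 73k file.
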
yeliudev commented 6 months ago

I see. Thank you!