MineDojo / MineCLIP

Foundation Model for MineDojo
MIT License

A question about the transcript start/duration of Minecraft YouTube data. #8

Closed zhengsipeng closed 1 year ago

zhengsipeng commented 1 year ago

Hi, thank you for releasing the collected YouTube videos and paired transcripts. However, I have a question about the transcripts: the start and duration times appear to be inconsistent.

For example, in the QPIJs2vN9f0.srt file:

{"text": "in jaxx you what the [ ] what is it if not i was", "start": 0.0, "duration": 4.97}, {"text": "jewish he has since hired by", "start": 4.97, "duration": 6.64}, {"text": "zidane that's away yeah i always knew", "start": 8.58, "duration": 6.059}, {"text": "that a [ ] his annie hall headdress", "start": 11.61, "duration": 5.61}

For text1, the start/duration is 0.0/4.97; for text2, it is 4.97/6.64; for text3, it is 8.58/6.059.

If the end time of text2 is 4.97 + 6.64 = 11.61, then text2 overlaps text3, which starts at 8.58. How can I determine the exact end time of each transcript segment?
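
To make the arithmetic concrete, here is a minimal Python sketch that computes the naive end time (start + duration) for the four segments quoted above and measures how far each one runs into the next. The function name `annotate` and the output fields `end`, `overlap`, and `clipped_end` are hypothetical names chosen for illustration. Clipping each end time to the start of the next segment is only one possible convention, assumed here; it is not something the dataset authors have confirmed.

```python
# Start/duration values copied from the QPIJs2vN9f0 sample above (text omitted).
segments = [
    {"start": 0.0, "duration": 4.97},     # text1
    {"start": 4.97, "duration": 6.64},    # text2
    {"start": 8.58, "duration": 6.059},   # text3
    {"start": 11.61, "duration": 5.61},   # text4
]


def annotate(segs):
    """Return a copy of each segment with naive and clipped end times.

    end         : start + duration, taken at face value
    overlap     : seconds by which `end` runs past the next segment's start
    clipped_end : min(end, next start); an ASSUMED convention, not a
                  documented rule of the dataset
    """
    rows = []
    for i, seg in enumerate(segs):
        end = seg["start"] + seg["duration"]
        if i + 1 < len(segs):
            next_start = segs[i + 1]["start"]
            overlap = max(0.0, end - next_start)
            clipped_end = min(end, next_start)
        else:
            # The last segment has nothing to overlap with.
            overlap = 0.0
            clipped_end = end
        rows.append({
            **seg,
            "end": round(end, 3),
            "overlap": round(overlap, 3),
            "clipped_end": round(clipped_end, 3),
        })
    return rows


if __name__ == "__main__":
    for row in annotate(segments):
        print(row)
```

In this sample the naive end of text2 (11.61) coincides with the start of text4, so each segment seems to stay active while the following one begins. Whether to keep those overlapping windows or to clip them depends on how the text is paired with video frames.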