ATriantafyllopoulos opened 4 years ago
I used this link to download the audio files for this data set.
However, there are problems with at least a few of the video files and/or their transcriptions:
- [ ] dia309_utt0.mp4: transcription contains a scene description that needs to be removed ("She doesn't hear him and keeps running, Chandler starts chasing her as the theme to")
- [ ] test_splits_wav/dia220_utt0.mp4: file is wrongly cut (video is 4 min long; the transcription is way off, as the clip shows Ross and Julie meeting Rachel at the airport, not Phoebe talking to Joey)
- [ ] test_splits_wav/dia38_utt4.mp4: file is wrongly cut (video is 5 min long)
- [ ] train_splits_wav/dia309_utt0.mp4: file is wrongly cut
In addition, I was able to verify that some of the old problems reported here still persist (e.g. dia793_utt0.mp4).
Have they been solved? Have I perhaps downloaded an old version of the data set?
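For anyone who wants to look for further wrongly cut clips, here is a minimal sketch of how the overlong files could be flagged automatically. It assumes `ffprobe` (from FFmpeg) is on the PATH and that a single-utterance clip should be well under a minute; the 60-second threshold, the function names, and the `probe` parameter are my own assumptions, not part of the data set tooling.

```python
import subprocess
from pathlib import Path


def probe_duration_seconds(path):
    """Return the container duration in seconds as reported by ffprobe."""
    out = subprocess.run(
        ["ffprobe", "-v", "error",
         "-show_entries", "format=duration",
         "-of", "default=noprint_wrappers=1:nokey=1",
         str(path)],
        capture_output=True, text=True, check=True,
    ).stdout
    return float(out.strip())


def find_overlong_clips(root, max_seconds=60.0, probe=probe_duration_seconds):
    """List (path, duration) for every .mp4 under root longer than max_seconds.

    max_seconds is an assumed cutoff; tune it to the utterance lengths you expect.
    """
    flagged = []
    for path in sorted(Path(root).rglob("*.mp4")):
        try:
            duration = probe(path)
        except (subprocess.CalledProcessError, ValueError, OSError):
            # Unreadable file or missing ffprobe: skip here, inspect by hand.
            continue
        if duration > max_seconds:
            flagged.append((str(path), duration))
    return flagged
```

Calling `find_overlong_clips("test_splits_wav")` should then list clips such as the 4 and 5 minute ones above. It only catches clips that are too long; a clip that is short but cut at the wrong position would still need a manual check against the transcription.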
I'm running into the same issues. Have you solved them?