declare-lab / MELD

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation
GNU General Public License v3.0

Multiple issues in the dataset. #9

Closed saxenarohit closed 5 years ago

saxenarohit commented 5 years ago
  1. Audio

Some clips contain an audio disturbance, which would have affected the extracted audio features.

A few examples: dia793_utt0.mp4, dia164_utt5.mp4, dia682_utt1.mp4, dia529_utt2.mp4, dia1029_utt1.mp4, dia1008_utt1.mp4

This affects almost all videos larger than 2.5 MB (around 200 videos in train_set).
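
To list the clips above that size, a minimal sketch along these lines could be used. It assumes the training clips sit in a single flat directory and reads "2.5 MB" as 2.5 × 1024 × 1024 bytes. The function name `find_large_clips` and the directory path in the usage note are illustrative placeholders, not part of MELD.

```python
from pathlib import Path

# Assumption: "2.5 MB" is interpreted as 2.5 MiB; adjust if a decimal MB was meant.
SIZE_THRESHOLD_BYTES = int(2.5 * 1024 * 1024)


def find_large_clips(clip_dir, threshold=SIZE_THRESHOLD_BYTES):
    """Return the sorted names of .mp4 files in clip_dir larger than threshold bytes."""
    return sorted(
        path.name
        for path in Path(clip_dir).glob("*.mp4")
        if path.is_file() and path.stat().st_size > threshold
    )
```

Calling `find_large_clips("path/to/train_set")` (a placeholder path) returns the candidate file names, which can then be checked by ear, since file size alone does not prove that a clip's audio is disturbed.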

  2. Video and text do not match.

For example

a) Dialogue 241: the sync between the text and the video breaks at utterance 1. Utterance 2 in the text is "I asked him.", while the video dia241_utt2.mp4 contains just the word "now", and the sync issue continues from there.

b) Dialogue 757: utterance 7 is also not synced with the text.

c) Dialogue 485: utterance 0 in the text is "Hey, this- Heyy...", but the video is a long clip.

There are many more video-text sync issues.

Is this dataset usable? Please help me with this.

soujanyaporia commented 5 years ago

As discussed over email, there are some alignment issues because of Gentle, the auto-aligner that we used. These issues have to be fixed manually, and we plan to do so in the near future. Rest assured that such videos are very few in number and do not affect the overall quality of the dataset. Thanks!