declare-lab / MELD

MELD: A Multimodal Multi-Party Dataset for Emotion Recognition in Conversation
GNU General Public License v3.0

Videos are not well-aligned with the texts. #30

Open tae898 opened 3 years ago

tae898 commented 3 years ago

This issue was already raised in https://github.com/declare-lab/MELD/issues/9

The video clips are poorly aligned with their utterance texts, which makes it hard for me to go multimodal at the moment.

I have two questions:

  1. Has this been fixed? Or are you planning on using a better alignment tool?
  2. Can I have access to the original Friends videos? I wonder if I can cut the videos into utterances myself using ASR.
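
To make the second question concrete, here is a rough sketch of the cutting step only. It assumes the ASR or forced-alignment pass has already produced per-utterance start and end times as `HH:MM:SS,mmm` strings, and that `ffmpeg` is installed. The helper names `srt_to_seconds` and `ffmpeg_cut_command` and the file names are hypothetical, not part of MELD.

```python
from typing import List


def srt_to_seconds(ts: str) -> float:
    """Convert an 'HH:MM:SS,mmm' timestamp (assumed format) to seconds."""
    hms, millis = ts.split(",")
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds + int(millis) / 1000.0


def ffmpeg_cut_command(src: str, start: str, end: str, dst: str) -> List[str]:
    """Build an ffmpeg argument list that cuts [start, end] out of src.

    The clip is re-encoded rather than stream-copied, so the cut does not
    have to snap to the nearest keyframe.
    """
    start_s = srt_to_seconds(start)
    duration = srt_to_seconds(end) - start_s
    if duration <= 0:
        raise ValueError("end must be later than start")
    return [
        "ffmpeg", "-y",
        "-ss", f"{start_s:.3f}",   # seek to the utterance start
        "-i", src,
        "-t", f"{duration:.3f}",   # keep only the utterance duration
        "-c:v", "libx264",
        "-c:a", "aac",
        dst,
    ]


if __name__ == "__main__":
    # Hypothetical episode file and utterance boundaries.
    cmd = ffmpeg_cut_command(
        "episode.mp4", "00:00:01,500", "00:00:04,000", "utt0.mp4"
    )
    print(" ".join(cmd))
```

The list returned by `ffmpeg_cut_command` can be passed to `subprocess.run(cmd, check=True)` to do the actual cut. How good the result is depends entirely on the accuracy of the timestamps.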