Soldelli / MAD

MAD: A Scalable Dataset for Language Grounding in Videos from Movie Audio Descriptions
MIT License
149 stars, 3 forks

Figure #1

Closed by Soldelli 2 years ago

Soldelli commented 2 years ago

[Image: mad]