patrick-tssn / VSTAR

[ACL2023] VSTAR is a multimodal dialogue dataset with scene and topic transition information
https://vstar-benchmark.github.io/