The dataset consists of public social media URL pairs and the corresponding entailment labels for an external conference (ACL 2021). Each URL contains a post with both linguistic (text) and visual (image) content. Entailment labels are human-annotated through Google Crowdsource.
Hi @cesar-ilharco!
Is there a GitHub repository for the website?
https://multimodal-entailment.github.io/
We'd love to adapt it for a tutorial we're about to present at ACM KDD this year.