Closed by Decem-Y 2 years ago
Hi @Decem-Y, we don’t have plans for it, especially because we focus on video-text tasks.
OK, thanks for your reply. SNLI-VE is a fine-grained visual reasoning task: predicting whether the relationship between an image and a text is entailment, neutral, or contradiction. I believe its test data has been reported to be noisy, though.
Have you tried evaluating the model on the SNLI-VE dataset?