zhangxi1997 / ECMR-VCR

The code for the paper "Explicit Cross-Modal Representation Learning for Visual Commonsense Reasoning".
4 stars, 0 forks