dandelin / ViLT

Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
Apache License 2.0

ViLT on GQA #85

Open keshavshivkumar opened 1 year ago

keshavshivkumar commented 1 year ago

I am an MSCS student at Rutgers University. My teammates and I fine-tuned ViLT on the GQA dataset. It was a great experience, learning how to peruse high-quality code and apply its concepts to a different dataset. CS-534_Project_Report.pdf