dandelin / ViLT

Code for the ICML 2021 (long talk) paper: "ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision"
Apache License 2.0

About SBU Caption dataset #68

Open 4fee8fea opened 2 years ago

4fee8fea commented 2 years ago

Hi @dandelin, thanks for your great work and for making it public!

We want to build on your work, but the SBU Caption dataset has become an obstacle: its download URL is no longer accessible.

Could you please share a copy of the file containing the [url - caption] pairs?

Thanks in advance!

bfshi commented 2 years ago

This link works:

https://www.cs.rice.edu/~vo9/sbucaptions/sbu-captions-all.tar.gz
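
For anyone fetching the archive from a script, here is a minimal sketch using only the Python standard library. The helper names `download` and `list_members` are hypothetical and not part of the ViLT repo. The layout of the archive is not described in this thread, so the sketch only lists the member names rather than assuming specific file names.

```python
import tarfile
import urllib.request
from pathlib import Path

# URL taken from the reply above.
SBU_URL = "https://www.cs.rice.edu/~vo9/sbucaptions/sbu-captions-all.tar.gz"


def download(url, dest):
    """Download url to dest unless dest already exists; return dest as a Path."""
    dest = Path(dest)
    if not dest.exists():
        urllib.request.urlretrieve(url, dest)
    return dest


def list_members(archive_path):
    """Return the member names stored in a gzip-compressed tar archive."""
    with tarfile.open(archive_path, "r:gz") as tar:
        return tar.getnames()
```

Calling `download(SBU_URL, "sbu-captions-all.tar.gz")` and then `list_members("sbu-captions-all.tar.gz")` should show which files hold the URLs and captions before you extract anything.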