zinengtang / TVLT

PyTorch code for “TVLT: Textless Vision-Language Transformer” (NeurIPS 2022 Oral)
MIT License
120 stars, 13 forks