Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of DeepMind, in PyTorch
[Question] How does the media locations attribute help in interleaving visual tokens with text? #9
Open
PrithivirajDamodaran opened 2 years ago
Please advise on the usage.