[Question] How does the media locations attribute help in interleaving visual tokens with text?

lucidrains / flamingo-pytorch

Implementation of 🦩 Flamingo, state-of-the-art few-shot visual question answering attention net out of Deepmind, in Pytorch

MIT License

1.21k stars, 59 forks

Open PrithivirajDamodaran opened 2 years ago

PrithivirajDamodaran commented 2 years ago

Please advise on the usage.

mcihadarslanoglu commented 1 year ago

Would be great.