hkproj / pytorch-paligemma

Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation: https://www.youtube.com/watch?v=vAmKB7iPkWw
226 stars, 40 forks