hkproj / pytorch-paligemma

Coding a Multimodal (Vision) Language Model from scratch in PyTorch with full explanation: https://www.youtube.com/watch?v=vAmKB7iPkWw
320 stars 57 forks source link