google-research / big_vision

Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
Apache License 2.0
2.2k stars, 147 forks

Megatron support for decoding bs=1 in PaliGemma #121

Closed: lucasb-eyer closed this 2 months ago

lucasb-eyer commented 2 months ago

This also includes, as a separate first commit, a bunch of minor changes we accumulated.

Megatron work was done by @andsteing and @andresusanopinto, credited in that commit.