jiaweizzhao / GaLore

GaLore: Memory-Efficient LLM Training by Gradient Low-Rank Projection
Apache License 2.0

Support for Jamba (ai21labs/Jamba-v0.1) #34

Open creatorrr opened 3 months ago

creatorrr commented 3 months ago

Jamba is a very interesting new model and I'd love to add GaLore support for fine-tuning it. It's an MoE + Transformer + Mamba hybrid, so I'm not sure how that would work with GaLore.

thoughts/pointers? @jiaweizzhao @agnim25 @darthjaja6
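Since GaLore projects only 2-D gradient matrices, one plausible starting point is to route Jamba's linear weights (attention projections, MoE expert/MLP matrices) into a GaLore param group and leave everything else (norms, biases, Mamba's SSM state parameters) in a regular group. A minimal sketch of that split, assuming illustrative parameter names — the `attn`/`experts`/`mlp` substrings are guesses, not verified against ai21labs/Jamba-v0.1:

```python
def split_params_for_galore(named_params):
    """Split (name, ndim) pairs into GaLore-eligible and regular groups.

    GaLore's low-rank projection applies to rank-2 weight matrices, so
    anything that is not 2-D (biases, norms, Mamba's 1-D SSM params)
    stays in the regular optimizer group. The name filters below are
    hypothetical placeholders for Jamba's actual module names.
    """
    galore, regular = [], []
    for name, ndim in named_params:
        if ndim == 2 and ("attn" in name or "experts" in name or "mlp" in name):
            galore.append(name)
        else:
            regular.append(name)
    return galore, regular
```

With the split in hand, the two groups could then be passed to `GaLoreAdamW` as param groups with `rank`, `update_proj_gap`, `scale`, and `proj_type` set on the GaLore group, following the pattern in this repo's README. Whether the projection behaves well on MoE expert matrices (which see sparse, routed gradients) is an open question.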

jlamprou commented 3 months ago

+1
