AI-Hypercomputer / maxtext

A simple, performant and scalable Jax LLM!
Apache License 2.0

Support LoRA training #609

Open hxssgaa opened 7 months ago

hxssgaa commented 7 months ago

Is there a plan to support PEFT methods such as LoRA in maxtext? This would enable fine-tuning and continued pretraining of larger models, so that models like LLaMA-3-70B can be trained even with a small number of TPUs/GPUs.
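For reference, the core idea behind LoRA can be sketched in plain JAX. This is not maxtext code and does not reflect any planned maxtext API; the names `init_lora`, `lora_dense`, and `loss_fn`, the layer sizes, and the `alpha` default are all hypothetical and chosen only for illustration. The pretrained weight stays frozen, and only two small low-rank factors receive gradients.

```python
import jax
import jax.numpy as jnp


def init_lora(key, d_in, d_out, rank):
    """Create trainable low-rank factors A (d_in x rank) and B (rank x d_out)."""
    # A gets a small random init; B starts at zero so the adapter is initially a no-op.
    a = jax.random.normal(key, (d_in, rank)) * 0.01
    b = jnp.zeros((rank, d_out))
    return {"a": a, "b": b}


def lora_dense(frozen_w, lora, x, alpha=16.0):
    """Compute x @ (W + (alpha / rank) * A @ B) without forming the full update."""
    rank = lora["a"].shape[1]
    scale = alpha / rank
    return x @ frozen_w + scale * ((x @ lora["a"]) @ lora["b"])


def loss_fn(lora, frozen_w, x, y):
    """Mean squared error, used here only as a placeholder objective."""
    pred = lora_dense(frozen_w, lora, x)
    return jnp.mean((pred - y) ** 2)


# Toy sizes (assumptions for the sketch, far smaller than a real LLM layer).
key_w, key_a, key_x, key_y = jax.random.split(jax.random.PRNGKey(0), 4)
d_in, d_out, rank = 64, 32, 4
frozen_w = jax.random.normal(key_w, (d_in, d_out))
lora = init_lora(key_a, d_in, d_out, rank)
x = jax.random.normal(key_x, (8, d_in))
y = jax.random.normal(key_y, (8, d_out))

# jax.grad differentiates with respect to the first argument only,
# so gradients are computed for the LoRA factors and not for frozen_w.
grads = jax.grad(loss_fn)(lora, frozen_w, x, y)

# Count trainable adapter parameters versus frozen base parameters.
trainable = sum(p.size for p in jax.tree_util.tree_leaves(lora))
print(trainable, frozen_w.size)
```

In this toy layer the adapter holds 384 trainable parameters against 2048 frozen ones. Gradients and optimizer state are only needed for the adapter, which is where the training-time savings come from; the frozen weights must still be held in memory.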

sbhavani commented 5 months ago

Any updates on when LoRA support would be available?

gobbleturk commented 3 months ago

This is on our roadmap with high priority. We will update here once we start working on it.