CoinCheung / gdGPT

Train LLMs (bloom, llama, baichuan2-7b, chatglm3-6b) with DeepSpeed pipeline mode. Faster than ZeRO/ZeRO++/FSDP.
Apache License 2.0
90 stars, 8 forks

Support mixtral-8x7b, and a small fix to match the new version of transformers #29

Closed. CoinCheung closed this 6 months ago.