AFDWang / Hetu-Galvatron

Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). If you have any interests, please visit/star/fork https://github.com/PKU-DAIR/Hetu-Galvatron
11 stars 4 forks source link

Galvatron version update. #3

Closed Fizzmy closed 3 months ago

Fizzmy commented 3 months ago
  1. Update search engine.
  2. Update Megatron version (bcce6f).
  3. Add llama_hf, modify llama_fa.
  4. Support Megatron-SP, adapt to gpt_hf, llama_hf.
  5. Support vocab parallel, adapt to gpt_hf.