microsoft / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2
1.83k stars 337 forks

DeepSpeed is a mountain of spaghetti code #386

Open ControllableGeneration opened 4 months ago

ControllableGeneration commented 4 months ago

As the title says.

Please spend some time cleaning up the code, DS team!!! The code is really hard to modify!!!!

ControllableGeneration commented 4 months ago

Also, why does PipelineEngine have no forward function? The eval_batch method is really hard to use and not portable at all!
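For context: PipelineEngine's eval_batch consumes micro-batches from a data iterator rather than accepting a single batch the way a module's forward would, which is part of what makes it awkward to drop into existing evaluation loops. A minimal sketch of one workaround is below. FakePipelineEngine and ForwardAdapter are both hypothetical stand-ins invented for illustration (the real PipelineEngine needs a full distributed setup); only the eval_batch(data_iter) calling convention is taken from the thread.

```python
class FakePipelineEngine:
    """Hypothetical stand-in for a pipeline engine that, like the one
    discussed above, exposes eval_batch(data_iter) but no forward()."""

    def eval_batch(self, data_iter):
        # The real method pulls micro-batches from the iterator and
        # returns an aggregate loss; here we consume one batch and
        # return its mean as a placeholder "loss".
        batch = next(data_iter)
        return sum(batch) / len(batch)


class ForwardAdapter:
    """Wrap an eval_batch-only engine behind a forward()-style call,
    so callers can pass one batch without hand-building an iterator."""

    def __init__(self, engine):
        self.engine = engine

    def forward(self, batch):
        # Wrap the single batch in a one-shot iterator, as
        # eval_batch expects.
        return self.engine.eval_batch(iter([batch]))


adapter = ForwardAdapter(FakePipelineEngine())
loss = adapter.forward([1.0, 2.0, 3.0])  # mean of the batch: 2.0
```

This does not make eval_batch portable, but it hides the iterator plumbing behind the batch-in, loss-out interface the commenter is asking for.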

LLMChild commented 4 months ago

Try this project: https://github.com/OpenBMB/BMTrain.git

loadams commented 1 month ago

Thanks @ControllableGeneration - we are going to work on improving this codebase.