alibaba / Pai-Megatron-Patch

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
Apache License 2.0

What environment setup is needed to run the Pai-Megatron framework on my own cluster? Are there any steps I can follow? #249

Closed qibao77 closed 3 months ago

jerryli1981 commented 3 months ago

Hello, you can refer to: https://mp.weixin.qq.com/s?__biz=Mzg4MzgxNDk2OA==&mid=2247491796&idx=1&sn=dc1d719313d794ae1aacdb07669a9545&chksm=cf430783f8348e950218bfcff861a2e6d2d92705807bf5b04f6e9268cc510ffa6e6aa2c87327#rd

Also, running on your own cluster works the same way as running multi-node PyTorch.
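
For readers unfamiliar with multi-node PyTorch, here is a minimal sketch of what that usually involves: every node runs the same `torchrun` command, differing only in its node rank. This is an illustration and not taken from the Pai-Megatron-Patch repository. The helper `build_torchrun_command`, the script name `train.py`, the address `10.0.0.1`, the port, and the node and GPU counts are all placeholder assumptions; the repository's own launch scripts may set things up differently.

```python
import shlex


def build_torchrun_command(node_rank, nnodes, nproc_per_node,
                           master_addr, master_port, script, script_args=()):
    """Build the torchrun argument list for one node of a multi-node job.

    Hypothetical helper for illustration only. All nodes share the same
    values except node_rank, which runs from 0 to nnodes - 1.
    """
    return [
        "torchrun",
        f"--nnodes={nnodes}",                  # total number of machines
        f"--nproc_per_node={nproc_per_node}",  # worker processes (GPUs) per machine
        f"--node_rank={node_rank}",            # index of this machine
        f"--master_addr={master_addr}",        # address of the rank-0 machine
        f"--master_port={master_port}",        # free port on the rank-0 machine
        script,
        *script_args,
    ]


if __name__ == "__main__":
    # Assumed setup: 2 nodes with 8 GPUs each; train.py is a placeholder script.
    for rank in range(2):
        cmd = build_torchrun_command(rank, 2, 8, "10.0.0.1", 29500, "train.py")
        print(f"node {rank}: {shlex.join(cmd)}")
```

Each printed line is the command to run on the corresponding machine. All nodes must be able to reach the master address and port, and should have the same software environment installed.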

yuanzhiyong1999 commented 1 week ago

Did you manage to get the environment set up?