alibaba / Pai-Megatron-Patch

The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
Apache License 2.0

Which version of megatron-lm does starcoder depend on? #314

Closed. bao-xiaoyi closed this issue 1 month ago

bao-xiaoyi commented 2 months ago

As the title says: I am currently running into a dependency problem. Also, is starcoder2 supported?

bao-xiaoyi commented 1 month ago

The dependency problem has been resolved.

bao-xiaoyi commented 1 month ago

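In addition, after converting deepseekcoder-v2 to megatron format (the conversion itself succeeds), the model's parameter count differs from that of deepseek-v2, so loading the model fails at training time. The cause is currently unknown.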

jerryli1981 commented 1 month ago

Hello, if you have further questions, you can post the deepseek-v2 error message in the group.
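
For a parameter-count mismatch like the one reported above, one way to narrow down the cause before posting the error is to compare parameter names and shapes between the converted checkpoint and the model the training script builds. The following is only a minimal sketch, not part of Pai-Megatron-Patch: the function names and the example shapes are hypothetical, and it assumes you have already collected a `name -> shape` mapping for each side (with PyTorch, for example, via `{k: tuple(v.shape) for k, v in state_dict.items()}`).

```python
from math import prod
from typing import Dict, List, Mapping, Tuple

Shape = Tuple[int, ...]


def count_params(shapes: Mapping[str, Shape]) -> int:
    """Total number of scalar parameters described by a name -> shape mapping."""
    return sum(prod(shape) for shape in shapes.values())


def diff_param_shapes(expected: Mapping[str, Shape],
                      actual: Mapping[str, Shape]) -> Dict[str, object]:
    """Report which parameters differ between two name -> shape mappings.

    `expected` is what the training-side model defines; `actual` is what the
    converted checkpoint contains. Both are hypothetical inputs for this sketch.
    """
    missing: List[str] = sorted(set(expected) - set(actual))      # in model, not in checkpoint
    unexpected: List[str] = sorted(set(actual) - set(expected))   # in checkpoint, not in model
    mismatched = {
        name: (tuple(expected[name]), tuple(actual[name]))
        for name in sorted(set(expected) & set(actual))
        if tuple(expected[name]) != tuple(actual[name])
    }
    return {"missing": missing, "unexpected": unexpected, "mismatched": mismatched}


# Made-up shapes purely for illustration; real names and sizes will differ.
model_shapes = {"embed.weight": (100, 8), "layer0.mlp.weight": (8, 32)}
ckpt_shapes = {"embed.weight": (102, 8), "layer0.moe.weight": (8, 32)}

report = diff_param_shapes(model_shapes, ckpt_shapes)
print(count_params(model_shapes), count_params(ckpt_shapes))
print(report)
```

The `missing`, `unexpected`, and `mismatched` entries usually point to where the two configurations diverge (for instance a different vocabulary size or layer layout), which is useful detail to include alongside the error message.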