THUDM / CodeGeeX

CodeGeeX: An Open Multilingual Code Generation Model (KDD 2023)
https://codegeex.cn
Apache License 2.0

Pretraining performance on Megatron #108

Open Amandaynzhou opened 1 year ago

Amandaynzhou commented 1 year ago

Hi all!

Since the initial training is based on MindSpore, I'm wondering whether there are any training results for the first stage on Megatron.

pangsg commented 1 year ago

Hello, I'm wondering how to run the initial training on MindSpore. Which script should be executed? Thanks for your help!