Open mx8435 opened 1 year ago
Hi, is the GLM model in ChatGLM-6B pretrained on this repo with just a few modifications (such as adding RoPE embeddings, sketched below, and using a new tokenizer), or on the GLM-130B-based repo, which supports Megatron model parallelism? Thanks.
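For reference, below is a minimal sketch of what "adding RoPE embeddings" means, i.e. rotary position embeddings applied to query/key vectors. This is an illustrative, self-contained example; the function names, shapes, and defaults are assumptions and do not come from the GLM or ChatGLM codebase.

```python
# Illustrative rotary position embedding (RoPE); not taken from the GLM/ChatGLM repos.
import torch

def build_rope_cache(seq_len: int, head_dim: int, base: float = 10000.0):
    # Per-dimension-pair frequencies: theta_i = base^(-2i / head_dim)
    inv_freq = 1.0 / (base ** (torch.arange(0, head_dim, 2).float() / head_dim))
    positions = torch.arange(seq_len).float()
    angles = torch.outer(positions, inv_freq)      # [seq_len, head_dim // 2]
    return torch.cos(angles), torch.sin(angles)

def apply_rope(x: torch.Tensor, cos: torch.Tensor, sin: torch.Tensor):
    # x: [batch, seq_len, num_heads, head_dim]; rotate each (even, odd) pair of
    # dimensions by a position-dependent angle so relative offsets appear in q·k.
    x1, x2 = x[..., 0::2], x[..., 1::2]
    cos = cos[None, :, None, :]                    # broadcast over batch and heads
    sin = sin[None, :, None, :]
    out = torch.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

if __name__ == "__main__":
    q = torch.randn(2, 16, 8, 64)                  # [batch, seq, heads, head_dim]
    cos, sin = build_rope_cache(seq_len=16, head_dim=64)
    print(apply_rope(q, cos, sin).shape)           # torch.Size([2, 16, 8, 64])
```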
I would also like to know the answer to this question.