alibaba / Pai-Megatron-Patch
The official repo of Pai-Megatron-Patch for LLM & VLM large-scale training, developed by Alibaba Cloud.
Apache License 2.0 · 674 stars · 94 forks
Issues · sorted by newest first
#209 · fix qwen1.5-megatron and add 32b · lwmlyy · closed · 5 months ago · 0 comments
#208 · Fix finetune with mcore training scripts · jerryli1981 · closed · 5 months ago · 1 comment
#207 · Enhance llama3-8b megatron training process · jerryli1981 · closed · 5 months ago · 1 comment
#206 · llama3 · yangzhipeng1108 · closed · 5 months ago · 0 comments
#205 · Fix Qwen1.5 32B gqa convertor · jerryli1981 · closed · 5 months ago · 1 comment
#204 · Fix qwen/llama/mistral gqa convertor · jerryli1981 · closed · 5 months ago · 1 comment
#203 · Update data preprocessing readme · jerryli1981 · closed · 5 months ago · 1 comment
#202 · Update data preprocessing readme · jerryli1981 · closed · 5 months ago · 1 comment
#201 · Update data preprocessing readme · jerryli1981 · closed · 5 months ago · 1 comment
#200 · Fix LLama3 shell script · jerryli1981 · closed · 5 months ago · 1 comment
#199 · Fix Qwen1.5 typo issue · jerryli1981 · closed · 5 months ago · 1 comment
#198 · Support Qwen1.5 32B and 72B for mcore training · jerryli1981 · closed · 5 months ago · 1 comment
#197 · Fix LLama3 70B Convertor · jerryli1981 · closed · 5 months ago · 1 comment
#196 · Fix Mixtral Convertor · jerryli1981 · closed · 5 months ago · 1 comment
#195 · Update ReadMe for qwen1.5 · jerryli1981 · closed · 5 months ago · 1 comment
#194 · Update ReadMe for LLama3 · jerryli1981 · closed · 5 months ago · 1 comment
#193 · Update Quick Starts for LLama3 · jerryli1981 · closed · 5 months ago · 1 comment
#192 · Change qwen convertor name · jerryli1981 · closed · 5 months ago · 1 comment
#191 · Update Quick Starts · jerryli1981 · closed · 5 months ago · 1 comment
#190 · Update Quick Starts · jerryli1981 · closed · 5 months ago · 1 comment
#189 · Which version of Megatron-LM does the Mixtral finetune example depend on? · cryoco · closed · 5 months ago · 5 comments
#188 · How should multi-node training of qwen72b be configured for a local deployment, and how is multi-node communication set up? · yangzhipeng1108 · closed · 5 months ago · 2 comments
#187 · Feature/load-balance: add expert replacement feature for MoE models (Mixtral) · uygnef · opened · 5 months ago · 1 comment
#186 · Update MegaBlocks and Quick Starts · jerryli1981 · closed · 5 months ago · 1 comment
#185 · Fix Mistral pretraining when PP>1 · jerryli1981 · closed · 5 months ago · 1 comment
#184 · finetune qwen1.5-4B with tp=2 fails when loading the model with embedding_shape · hudengjunai · closed · 5 months ago · 3 comments
#183 · Does the Mistral continued-pretraining script support pp>=2? · kobayashikanna01 · closed · 5 months ago · 4 comments
#182 · Update Quick Starts · jerryli1981 · closed · 5 months ago · 1 comment
#181 · Update Megatron-LM Soft Link · jerryli1981 · closed · 5 months ago · 1 comment
#180 · Adapt tokenizer init for old Megatron-LM version · jerryli1981 · closed · 5 months ago · 1 comment
#179 · Please update Falcon :) · pharaouk · closed · 5 months ago · 8 comments
#178 · Fix qwen1.5 hf2mcore_v2 convertor · jerryli1981 · closed · 5 months ago · 1 comment
#177 · Optimize Quick Start docs and enhance Qwen1.5 mcore pretraining · jerryli1981 · closed · 5 months ago · 1 comment
#176 · Update Llama2 and Mistral Readme · jerryli1981 · closed · 5 months ago · 1 comment
#175 · Update Llama2 and Mistral Readme · jerryli1981 · closed · 5 months ago · 1 comment
#174 · Update Llama2 and Mistral Readme · jerryli1981 · closed · 5 months ago · 1 comment
#173 · fix qwen14b convert and pretrain · lk137095576 · closed · 5 months ago · 0 comments
#172 · Some bugs exist; Mistral and Llama-13 SFT currently run end-to-end (resolved) · zhl5842 · closed · 5 months ago · 0 comments
#171 · fix spelling error in the readme · SherLockOD · closed · 5 months ago · 1 comment
#170 · fix spelling error in the readme · SherLockOD · closed · 5 months ago · 1 comment
#169 · fix qwen convert and pretrain bash · lk137095576 · closed · 5 months ago · 1 comment
#168 · fix qwen pretrain bash bug · lk137095576 · closed · 5 months ago · 1 comment
#167 · qwen/hf2mcore_1.5_v2.py errors out when converting HF to mcore format with tp > 1 · cdj0311 · closed · 5 months ago · 2 comments
#166 · Bug encountered when finetuning llama · Emperorizzis · closed · 5 months ago · 3 comments
#165 · When will support for converting Megatron models to Hugging Face models with pipeline parallelism enabled be available for Qwen1.5 models? · kaiwang13 · closed · 5 months ago · 5 comments
#164 · Will Pai-Megatron-Patch support PEFT and quantization? · haolin-nju · closed · 5 months ago · 1 comment
#163 · How to install the training environments? · alphanlp · closed · 5 months ago · 1 comment
#162 · fix import error in lm_evaluate.py · lihengtao · closed · 6 months ago · 2 comments
#161 · pretrain_mcore_llama.py throws an error · CaesarWWK · closed · 5 months ago · 0 comments
#160 · Remove idx=0 from data llama · jerryli1981 · closed · 6 months ago · 1 comment