alibaba / Pai-Megatron-Patch
The official repo of Pai-Megatron-Patch for LLM & VLM large-scale training, developed by Alibaba Cloud.
Apache License 2.0 · 674 stars · 94 forks
Issues · sorted by newest first
#209 · fix qwen1.5-megatron and add 32b · lwmlyy · closed · 5 months ago · 0 comments
#208 · Fix finetune with mcore training scripts · jerryli1981 · closed · 5 months ago · 1 comment
#207 · Enhance llama3-8b megatron training process · jerryli1981 · closed · 5 months ago · 1 comment
#206 · llama3 · yangzhipeng1108 · closed · 5 months ago · 0 comments
#205 · Fix Qwen1.5 32B gqa convertor · jerryli1981 · closed · 5 months ago · 1 comment
#204 · Fix qwen/llama/mistral gqa convertor · jerryli1981 · closed · 5 months ago · 1 comment
#203 · Update data preprocessing readme · jerryli1981 · closed · 5 months ago · 1 comment
#202 · Update data preprocessing readme · jerryli1981 · closed · 5 months ago · 1 comment
#201 · Update data preprocessing readme · jerryli1981 · closed · 5 months ago · 1 comment
#200 · Fix LLama3 shell script · jerryli1981 · closed · 5 months ago · 1 comment
#199 · Fix Qwen1.5 typo issue · jerryli1981 · closed · 5 months ago · 1 comment
#198 · Support Qwen1.5 32B and 72B for mcore training · jerryli1981 · closed · 5 months ago · 1 comment
#197 · Fix LLama3 70B Convertor · jerryli1981 · closed · 5 months ago · 1 comment
#196 · Fix Mixtral Convertor · jerryli1981 · closed · 5 months ago · 1 comment
#195 · Update ReadMe for qwen1.5 · jerryli1981 · closed · 5 months ago · 1 comment
#194 · Update ReadMe for LLama3 · jerryli1981 · closed · 5 months ago · 1 comment
#193 · Update Quick Starts for LLama3 · jerryli1981 · closed · 5 months ago · 1 comment
#192 · Change qwen convertor name · jerryli1981 · closed · 5 months ago · 1 comment
#191 · Update Quick Starts · jerryli1981 · closed · 5 months ago · 1 comment
#190 · Update Quick Starts · jerryli1981 · closed · 5 months ago · 1 comment
#189 · Which version of Megatron-LM does the Mixtral finetune example depend on? · cryoco · closed · 5 months ago · 5 comments
#188 · How should multi-node training of qwen72b be configured for a local deployment, and how is multi-node communication set up? · yangzhipeng1108 · closed · 5 months ago · 2 comments
#187 · Feature/load-balance: add expert replacement feature for MoE models (Mixtral) · uygnef · opened · 5 months ago · 1 comment
#186 · Update MegaBlocks and Quick Starts · jerryli1981 · closed · 5 months ago · 1 comment
#185 · Fix Mistral pretraining when PP>1 · jerryli1981 · closed · 5 months ago · 1 comment
#184 · finetune qwen1.5-4B with tp=2 fails when loading the model with embedding_shape · hudengjunai · closed · 5 months ago · 3 comments
#183 · Does the Mistral continued-pretraining script support pp>=2? · kobayashikanna01 · closed · 5 months ago · 4 comments
#182 · Update Quick Starts · jerryli1981 · closed · 5 months ago · 1 comment
#181 · Update Megatron-LM Soft Link · jerryli1981 · closed · 5 months ago · 1 comment
#180 · Adapt tokenizer init for old Megatron-LM version · jerryli1981 · closed · 5 months ago · 1 comment
#179 · Please update Falcon :) · pharaouk · closed · 5 months ago · 8 comments
#178 · Fix qwen1.5 hf2mcore_v2 convertor · jerryli1981 · closed · 5 months ago · 1 comment
#177 · Optimize Quick Start docs and enhance Qwen1.5 mcore pretraining · jerryli1981 · closed · 5 months ago · 1 comment
#176 · Update Llama2 and Mistral Readme · jerryli1981 · closed · 5 months ago · 1 comment
#175 · Update Llama2 and Mistral Readme · jerryli1981 · closed · 5 months ago · 1 comment
#174 · Update Llama2 and Mistral Readme · jerryli1981 · closed · 5 months ago · 1 comment
#173 · fix qwen14b convert and pretrain · lk137095576 · closed · 5 months ago · 0 comments
#172 · Some bugs exist; Mistral and Llama-13 SFT currently run end-to-end (resolved) · zhl5842 · closed · 5 months ago · 0 comments
#171 · fix spelling error in the readme · SherLockOD · closed · 5 months ago · 1 comment
#170 · fix spelling error in the readme · SherLockOD · closed · 5 months ago · 1 comment
#169 · fix qwen convert and pretrain bash · lk137095576 · closed · 5 months ago · 1 comment
#168 · fix qwen pretrain bash bug · lk137095576 · closed · 5 months ago · 1 comment
#167 · qwen/hf2mcore_1.5_v2.py errors out when converting HF to mcore format with tp > 1 · cdj0311 · closed · 5 months ago · 2 comments
#166 · Bug encountered when finetuning llama · Emperorizzis · closed · 5 months ago · 3 comments
#165 · When will support for converting Megatron models to Hugging Face models with pipeline parallelism enabled be available for Qwen1.5 models? · kaiwang13 · closed · 5 months ago · 5 comments
#164 · Will Pai-Megatron-Patch support PEFT and quantization? · haolin-nju · closed · 5 months ago · 1 comment
#163 · How to install the training environments? · alphanlp · closed · 5 months ago · 1 comment
#162 · fix import error in lm_evaluate.py · lihengtao · closed · 6 months ago · 2 comments
#161 · pretrain_mcore_llama.py throws an error · CaesarWWK · closed · 5 months ago · 0 comments
#160 · Remove idx=0 from data llama · jerryli1981 · closed · 6 months ago · 1 comment