alibaba/Pai-Megatron-Patch
The official repo of Pai-Megatron-Patch for LLM & VLM large scale training developed by Alibaba Cloud.
Apache License 2.0
674 stars
94 forks
issues
fix hang when tp bs both > 1 during sft
#309
lwmlyy
closed
2 months ago
0
Support for Flash-Attn 3
#308
echo-valor
opened
2 months ago
1
Sorry to bother you, raising an issue about multi-node training
#307
CallmeZhangChenchen
closed
2 months ago
4
Is sharegpt-format data supported? Or multi-turn dialogue data with a "history" field?
#306
jiejie1993
opened
2 months ago
1
[rank31]: OSError: error stat()ing file (dataset map problem)
#305
shyzzz521
opened
2 months ago
0
The DingTalk group is full
#304
divisionblur
closed
1 month ago
5
Missing key(s) in state_dict: llama3 weights do not match after mcore conversion
#303
wuduher
closed
2 months ago
5
OSError: [Errno 28] No space left on device (asking for help)
#302
shyzzz521
closed
1 month ago
2
The bigcode-evaluation-harness repository appears to no longer exist
#301
CallmeZhangChenchen
closed
2 months ago
2
Initial loss is abnormal when seq len is set large
#300
Jayce1kk
closed
1 month ago
3
Fix bug of memstats and add scripts
#299
lostkevin
closed
2 months ago
0
Add auto optimizer offloading and update ReadMe
#298
lostkevin
closed
2 months ago
0
enable finetune with idxmap datasets
#297
jerryli1981
closed
2 months ago
1
Should this parameter be removed for the Qwen2 0.5B and 1.5B models?
#296
MrWaterZhou
closed
1 month ago
1
QwenTokenizer and Qwen2Tokenizer
#295
sexan
opened
2 months ago
3
Package conflicts in the nvcr.io/nvidia/pytorch:23.12-py3 image
#294
wuduher
closed
2 months ago
1
[rank2]: AttributeError: 'IndexedDataset' object has no attribute 'sizes'
#293
wccccp
opened
2 months ago
1
Add Chunking for Static Offloading Policy and Add Safetensor Format for Mcore2HF Convert
#292
jerryli1981
closed
2 months ago
1
Update CPU Offload Optimizer
#291
jerryli1981
closed
2 months ago
1
Update CPU Offload Optimizer
#290
jerryli1981
closed
2 months ago
1
qwen2 MG and HF mismatch
#289
vlad-karpuhin
opened
2 months ago
0
update submodule to cf16caf
#288
lostkevin
closed
2 months ago
0
GPU memory usage problem during model conversion
#287
coder-wangzhen
closed
2 months ago
2
Update Megatron-LM-240705-Performance-Booster
#286
jerryli1981
closed
2 months ago
1
qwen2-7b problem when tp=2, pp=1
#285
MrWaterZhou
closed
2 months ago
6
[BUG] `layer_number` argument cannot be parsed
#284
cingtiye
closed
2 months ago
0
Add MPI Support for tp-comm-overlap and Cpu-Offload for Mcore Distrib…
#283
jerryli1981
closed
2 months ago
1
qwen-moe-megablocks weight conversion problem
#282
yingzhao27
opened
2 months ago
1
Fix llama3 70b convert
#281
lee0ray
closed
3 months ago
1
Problem with weight conversion
#280
Jayce1kk
closed
2 months ago
3
Why must num_query_groups be greater than target_tensor_model_parallel_size when sharding the qwen2 model?
#279
jianhai0527
opened
3 months ago
0
Is the data format used in the Qwen1.5 SFT stage LLama-Pretrain-Raw?
#278
cocaer
closed
3 months ago
1
Does Qwen2 support converting dense to MoE via Sparse Upcycling?
#277
zTaoplus
closed
2 weeks ago
0
Can the data preprocessing scripts run on a Mac? They fail to compile
#276
shine10076
closed
1 month ago
1
Fix Qwen2 MoE Loss Convergence Issue
#275
jerryli1981
closed
3 months ago
1
Add Floating Point Control for Qwen2 Model Convertor
#274
jerryli1981
closed
3 months ago
1
Enhance Qwen2 Model Finetune Script and Make Qwen2Tokenzier callable
#273
jerryli1981
closed
3 months ago
1
Add Qwen2-MoE Pipeline Parallel Model Convertor
#272
jerryli1981
closed
3 months ago
1
qwen2 72b state_dict mismatch with TE
#271
getao
closed
3 months ago
5
Update Qwen2 ReadMe
#270
jerryli1981
closed
3 months ago
0
Fix Qwen2 MoE hf2mcore convertor via TE
#269
jerryli1981
closed
3 months ago
1
add pipeline parallel convertor from mcore to huggingface
#268
camille1874
closed
3 months ago
1
Add Qwen2 MoE Model Mcore Implementation
#267
jerryli1981
closed
3 months ago
1
Qwen1&Qwen1.5&Qwen2 tokenizer fix
#266
Carrie-Yi
closed
3 months ago
1
Fix Qwen2-7B Model Convertor Extra Vocab Size to 421
#265
jerryli1981
closed
3 months ago
1
Update Qwen2 ReadMe and Fix GQA Model Convert Issue
#264
jerryli1981
closed
3 months ago
1
Update Qwen2 ReadMe and Fix GQA Model Convert Issue
#263
jerryli1981
closed
3 months ago
1
qwen2-7b: after enabling context parallel, logits at padding positions are nan
#262
WallE-Chang
opened
3 months ago
0
A piece of logic in hf2mcore_qwen1.5_dense_mha_to_moe.py that I do not quite understand
#261
steins048596
opened
3 months ago
0
Fix Qwen2-7B Extra Vocab Size to 421
#260
jerryli1981
closed
3 months ago
1