HuangLK / transpeeder
Train LLaMA on a single A100 80G node using 🤗 transformers and 🚀 DeepSpeed pipeline parallelism.
Apache License 2.0 · 208 stars · 18 forks
Issues (newest first)
#45 Can the use of models other than LLAMA be supported? · ojipadeson · opened 3 months ago · 1 comment
#44 fix dp issue and update convert script · HuangLK · closed 11 months ago · 0 comments
#43 The model is split into 4 shards; how can each GPU be made to load only one shard? · yangzhipeng1108 · opened 11 months ago · 1 comment
#42 Refine dp · JY-Ren · closed 11 months ago · 0 comments
#41 size mismatch · sysuprophet · opened 1 year ago · 1 comment
#40 New dev · HuangLK · closed 1 year ago · 0 comments
#39 Add ntk, flash-attn2 and support llama2 · JY-Ren · closed 1 year ago · 0 comments
#38 ImportError: cannot import name 'flash_attn_unpadded_qkvpacked_func' from 'flash_attn.flash_attn_interface' · BastianChen · closed 1 year ago · 3 comments
#37 Error when using ZeRO-1 · bebory · opened 1 year ago · 1 comment
#36 Flash attention integration failed · SparkJiao · opened 1 year ago · 0 comments
#35 How to set the distributed sampler when combining pipeline parallelism with data parallelism · SparkJiao · closed 1 year ago · 2 comments
#34 RuntimeError: element 1 of tensors does not require grad and does not have a grad_fn · SparkJiao · closed 1 year ago · 1 comment
#33 Some questions about the pipeline model · lyzKF · closed 1 year ago · 2 comments
#32 How can I run it on a 24G GPU card like the 3090? · SeekPoint · opened 1 year ago · 2 comments
#31 flash_attn_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol · SeekPoint · closed 1 year ago · 0 comments
#30 Why do we need to add 1 to the vocab_size when constructing the model? · forceshorty · opened 1 year ago · 2 comments
#29 Output is not getting saved · dittops · opened 1 year ago · 1 comment
#28 File not found error · AlvL1225 · closed 1 year ago · 3 comments
#27 attention mask · zhhao1 · closed 1 year ago · 5 comments
#26 Is tensor parallelism still unsupported? Only pipeline parallelism and data parallelism? · ezioliao · opened 1 year ago · 2 comments
#25 Are there plans to support peft? · zhangsanfeng86 · opened 1 year ago · 1 comment
#24 hidden_states is a bool variable · iMountTai · opened 1 year ago · 1 comment
#23 support bf16? · lw3259111 · opened 1 year ago · 4 comments
#22 Running 7B succeeded; 30B next · hudengjunai · opened 1 year ago · 10 comments
#21 Loading a checkpoint in train.py seems to have no effect · GongCQ · closed 1 year ago · 0 comments
#20 Error when retraining 7B-llama on 4 GPUs after clearing the cache · Ulov888 · closed 1 year ago · 1 comment
#19 Full fine-tuning 7B-llama on four 3090s: the training startup phase is very slow · Ulov888 · closed 1 year ago · 1 comment
#18 Model loading · zhhao1 · closed 1 year ago · 1 comment
#17 Optimize log and update requirements.txt · muou55555 · closed 1 year ago · 0 comments
#16 Changing the dataset raises StopIteration · huiyangzhou · closed 1 year ago · 0 comments
#15 update readme · HuangLK · closed 1 year ago · 0 comments
#14 use LayerSpec · HuangLK · closed 1 year ago · 0 comments
#13 fix incomplete micro batch · HuangLK · closed 1 year ago · 0 comments
#12 remove unused arg · HuangLK · closed 1 year ago · 0 comments
#11 add flash-attn · HuangLK · closed 1 year ago · 0 comments
#10 upgrade transformers version · HuangLK · closed 1 year ago · 0 comments
#9 TypeError: 'NoneType' object is not subscriptable · huiyangzhou · closed 1 year ago · 3 comments
#8 Loss = nan during training · zhangsanfeng86 · closed 1 year ago · 5 comments
#7 Question about batch size · zhhao1 · closed 1 year ago · 4 comments
#6 How to convert the generated model to Hugging Face format? · zhangsanfeng86 · closed 1 year ago · 2 comments
#5 Is this error caused by the transformers version? · zhangsanfeng86 · closed 1 year ago · 4 comments
#4 Loss quickly drops to 0 · muou55555 · closed 1 year ago · 19 comments
#3 add activation_checkpointing · HuangLK · closed 1 year ago · 0 comments
#2 update readme · HuangLK · closed 1 year ago · 0 comments
#1 vanilla version · HuangLK · closed 1 year ago · 0 comments