HuangLK / transpeeder
Train LLaMA on a single A100 80G node using 🤗 transformers and 🚀 DeepSpeed pipeline parallelism.
Apache License 2.0 · 208 stars · 18 forks
Issues (newest first)
#45 Can the use of models other than LLAMA be supported? · ojipadeson · opened 3 months ago · 1 comment
#44 fix dp issue and update convert script · HuangLK · closed 11 months ago · 0 comments
#43 The model is split into 4 shards; how can each GPU be made to load only one shard? · yangzhipeng1108 · opened 11 months ago · 1 comment
#42 Refine dp · JY-Ren · closed 11 months ago · 0 comments
#41 size mismatch · sysuprophet · opened 1 year ago · 1 comment
#40 New dev · HuangLK · closed 1 year ago · 0 comments
#39 Add ntk, flash-attn2 and support llama2 · JY-Ren · closed 1 year ago · 0 comments
#38 ImportError: cannot import name 'flash_attn_unpadded_qkvpacked_func' from 'flash_attn.flash_attn_interface' · BastianChen · closed 1 year ago · 3 comments
#37 Error when using ZeRO-1 · bebory · opened 1 year ago · 1 comment
#36 Flash attention integration failed · SparkJiao · opened 1 year ago · 0 comments
#35 How to set the distributed sampler when combining pipeline parallelism with data parallelism · SparkJiao · closed 1 year ago · 2 comments
#34 RuntimeError: element 1 of tensors does not require grad and does not have a grad_fn · SparkJiao · closed 1 year ago · 1 comment
#33 Some questions about the pipeline model · lyzKF · closed 1 year ago · 2 comments
#32 How can I run it on a 24G GPU card like the 3090? · SeekPoint · opened 1 year ago · 2 comments
#31 flash_attn_cuda.cpython-38-x86_64-linux-gnu.so: undefined symbol · SeekPoint · closed 1 year ago · 0 comments
#30 Why do we need to add 1 to the vocab_size when constructing the model? · forceshorty · opened 1 year ago · 2 comments
#29 Output is not getting saved · dittops · opened 1 year ago · 1 comment
#28 File not found error · AlvL1225 · closed 1 year ago · 3 comments
#27 attention mask · zhhao1 · closed 1 year ago · 5 comments
#26 Is tensor parallelism still unsupported? Only pipeline parallelism and data parallelism? · ezioliao · opened 1 year ago · 2 comments
#25 Are there plans to support peft? · zhangsanfeng86 · opened 1 year ago · 1 comment
#24 hidden_states is a bool variable · iMountTai · opened 1 year ago · 1 comment
#23 support bf16? · lw3259111 · opened 1 year ago · 4 comments
#22 Running 7B succeeded; 30B next · hudengjunai · opened 1 year ago · 10 comments
#21 Loading a checkpoint in train.py seems to have no effect · GongCQ · closed 1 year ago · 0 comments
#20 Error when retraining 7B-llama on 4 GPUs after clearing the cache · Ulov888 · closed 1 year ago · 1 comment
#19 Full fine-tuning 7B-llama on four 3090s: the training startup phase is very slow · Ulov888 · closed 1 year ago · 1 comment
#18 Model loading · zhhao1 · closed 1 year ago · 1 comment
#17 Optimize log and update requirements.txt · muou55555 · closed 1 year ago · 0 comments
#16 Changing the dataset raises StopIteration · huiyangzhou · closed 1 year ago · 0 comments
#15 update readme · HuangLK · closed 1 year ago · 0 comments
#14 use LayerSpec · HuangLK · closed 1 year ago · 0 comments
#13 fix incomplete micro batch · HuangLK · closed 1 year ago · 0 comments
#12 remove unused arg · HuangLK · closed 1 year ago · 0 comments
#11 add flash-attn · HuangLK · closed 1 year ago · 0 comments
#10 upgrade transformers version · HuangLK · closed 1 year ago · 0 comments
#9 TypeError: 'NoneType' object is not subscriptable · huiyangzhou · closed 1 year ago · 3 comments
#8 Loss = nan during training · zhangsanfeng86 · closed 1 year ago · 5 comments
#7 Question about batch size · zhhao1 · closed 1 year ago · 4 comments
#6 How to convert the generated model to Hugging Face format? · zhangsanfeng86 · closed 1 year ago · 2 comments
#5 Is this error caused by the transformers version? · zhangsanfeng86 · closed 1 year ago · 4 comments
#4 Loss quickly drops to 0 · muou55555 · closed 1 year ago · 19 comments
#3 add activation_checkpointing · HuangLK · closed 1 year ago · 0 comments
#2 update readme · HuangLK · closed 1 year ago · 0 comments
#1 vanilla version · HuangLK · closed 1 year ago · 0 comments