OpenBMB/BMTrain
Efficient Training (including pre-training and fine-tuning) for Big Models
Apache License 2.0
560 stars
77 forks
issues
Offload activation async support
#156
MayDomine
opened
1 year ago
0
Fix middle hidden
#155
zkh2016
closed
1 year ago
0
mv zero_level to CheckpointBlock
#154
zkh2016
closed
1 year ago
1
Tensor Parallel
#153
zkh2016
closed
1 year ago
0
fix is_first_layer
#152
zkh2016
closed
1 year ago
0
Fix inspect model when param is None
#151
zkh2016
closed
1 year ago
0
print_inspect(model, "*") raises an error when the model contains Linear(config.hidden_size, config.vocab_size, bias=False)
#150
jinmin527
closed
1 year ago
1
Support Tensor Parallel
#149
zkh2016
closed
1 year ago
0
Problem wrapping a large model with BMTrainModelWrapper
#148
jinmin527
closed
1 year ago
4
Add Reduce Scatter communication op.
#147
MayDomine
closed
1 year ago
0
FIX Error: tensor slice in gather()
#146
JerryYin777
closed
1 year ago
0
Potential problem with gather result
#145
zkh2016
closed
1 year ago
1
Features yml issue
#144
MayDomine
closed
1 year ago
0
ISSUE TEMPLATE
#143
MayDomine
closed
1 year ago
0
Potential problem with gather result
#142
zkh2016
closed
1 year ago
0
Add an issue template for bug reports
#141
zkh2016
closed
1 year ago
0
Potential problem with gather_result
#140
zkh2016
closed
1 year ago
1
fix nccl import
#139
MayDomine
closed
1 year ago
0
BMTrain v0.2.3.post2
#138
MayDomine
closed
1 year ago
0
Error when pip install bmtrain
#137
HBX-hbx
closed
1 year ago
2
Add Bf16 Support
#136
Achazwl
closed
1 year ago
0
New release
#135
MayDomine
closed
1 year ago
0
Add bf16 Support to Adam
#134
JerryYin777
closed
1 year ago
1
0.2.3 release
#133
MayDomine
closed
1 year ago
0
Fix parallel_for
#132
Achazwl
closed
1 year ago
0
Could BMTrain be used together with spark-gpu in the future, and could Java, Scala, and C++ versions of bmtrain be developed?
#131
mullerhai
opened
1 year ago
1
now support pyproject.toml
#130
MayDomine
closed
1 year ago
0
feat: min-max constrain of OptimManager's loss scale
#129
Achazwl
closed
1 year ago
0
Refactor ZeRO, checkpoint and pipeline code
#128
zkh2016
closed
1 year ago
0
[WIP]Using hooks to implement ZeRO and Checkpoint
#127
zkh2016
closed
1 year ago
0
remove inappropriate import in __init__.py
#126
Achazwl
closed
1 year ago
0
TypeError: expected string or bytes-like object
#125
dage0127
closed
1 year ago
1
How to distribute weights to different GPUs?
#124
w32zhong
closed
1 year ago
2
BMTrain installation fails: nccl.obj : error LNK2001: XXXX
#123
ShuaiLing-Shao
closed
1 year ago
1
Make Checkpointing Optional
#122
MayDomine
closed
1 year ago
0
bmt.load(model) -> Unexpected OOM
#121
MayDomine
closed
1 year ago
0
Adam offloading thread bugs
#120
MayDomine
closed
1 year ago
0
BMTrain setup without torch
#119
MayDomine
closed
1 year ago
0
Model loading
#118
ftgreat
closed
1 year ago
1
Installation succeeds but import fails, bmtrain version 0.2.2
#117
Mandy0016
closed
1 year ago
2
Loss decreases very slowly when fine-tuning on self-assembled Q&A data
#116
deerluffy
closed
1 year ago
1
BMTrain v0.2.3
#115
MayDomine
closed
1 year ago
0
bmtrain compiling without torch
#114
ithyl
closed
1 year ago
34
Installation succeeds but import fails: cannot import 'C_' from bmtrain.nccl
#113
YingLaiLin
closed
1 year ago
2
Hello, installation error on Linux
#112
xiaoguaishoubaobao
closed
1 year ago
1
fix: missing nccl
#111
benkerd22
closed
1 year ago
1
Installation fails with cuda11.7, torch==1.13.1, ubuntu22.04?
#110
xiaohaihui-smart
closed
1 year ago
2
bmtrain installation fails with cuda11.8
#109
monarchwise
closed
1 year ago
2
how to skip one iter for all ranks
#108
ftgreat
closed
1 year ago
9
Installation fails on Windows
#107
wilderchen
closed
1 year ago
2