hpcaitech
/
ColossalAI-Examples
Examples of training models with hybrid parallelism using ColossalAI
Apache License 2.0
334
stars
102
forks
issues
[example] add example requirement
#208
binmakeswell
closed
1 year ago
0
connection failure
#207
lhj-git
opened
1 year ago
2
cannot import name 'OPTForCausalLM'
#206
upwindflys
opened
1 year ago
0
Test/test gpt2 performance
#205
YuliangLiu0306
closed
1 year ago
1
grad is none when run gpt2 with pipeline parallelism only
#204
lin88lin8850
opened
1 year ago
0
Outdated OPT example
#203
larry-fuy
opened
1 year ago
0
[Pipeline Middleware] Add an OPT Pipeline Example
#202
Wesley-Jzy
closed
1 year ago
0
CVE-2007-4559 Patch
#201
TrellixVulnTeam
opened
1 year ago
0
[autoparallel] update auto parallel demo
#200
YuliangLiu0306
opened
1 year ago
0
Load ColossalAI GPT model as HuggingFace/Transformers Model
#199
Red-Giuliano
opened
1 year ago
2
Update LICENSE
#198
binmakeswell
closed
1 year ago
0
detr-debug pipelinable.py
#197
lsx66
opened
1 year ago
1
there may be a bug in train_gpt.py (https://github.com/hpcaitech/ColossalAI-Examples/blob/main/language/gpt/train_gpt.py)
#196
lambda7xx
opened
1 year ago
5
It seems the pipeline parallel document is out of date (https://www.colossalai.org/docs/features/pipeline_parallel)
#195
lambda7xx
opened
1 year ago
7
remove redundant tutorial files
#194
binmakeswell
closed
1 year ago
0
question about import model_zoo.gpt.gpt as col_gpt
#193
lambda7xx
opened
1 year ago
2
[sc] handson6 for auto activation checkpoint.
#192
super-dainiu
closed
1 year ago
2
[sc demo] add log to autoparallel demo
#191
YuliangLiu0306
closed
1 year ago
0
add ColoDiffusion code: part1
#190
MaruyamaAya
closed
1 year ago
0
[resnet] autoparallel resnet demo
#189
YuliangLiu0306
closed
1 year ago
0
[diffusion] initialize a directory
#188
feifeibear
closed
1 year ago
0
Add Handsons to ColossalAI-Examples
#187
BoxiangW
closed
1 year ago
0
ImportError running detr
#186
LSC527
opened
1 year ago
1
[zero] fix example in features/zero
#185
1SAA
closed
1 year ago
0
debug zero example
#184
feifeibear
closed
1 year ago
1
add RoBERTa
#183
mandoxzhang
closed
1 year ago
0
add version for example code
#182
feifeibear
closed
1 year ago
0
Add OPT Pipeline
#181
Wesley-Jzy
closed
1 year ago
0
The error happened when I did multi-node distributed training
#180
ShangWeiKuo
opened
1 year ago
1
[rpc_pipeline] add baseline for gpt-2 | hidden dataset path
#179
LSTM-Kirigaya
closed
1 year ago
0
[rpc_pipeline] add rpc example in features/pipeline_parallel | add gp…
#178
LSTM-Kirigaya
closed
1 year ago
0
[moe] adapt moe example to the newest version
#177
1SAA
closed
2 years ago
0
RuntimeError: CUDA out of memory with cifar10 in data_parallel example
#176
fuhengwu2021
closed
2 years ago
1
ImportError: cannot import name 'colo_state_dict' from 'colossalai.utils.model.colo_init_context'
#175
fuhengwu2021
opened
2 years ago
1
Update amp README: add missing dependency `titans`
#174
ofey404
closed
2 years ago
0
[requirements] modify requirements.txt
#173
Cypher30
closed
2 years ago
0
BERT Data Preprocessing
#172
JizeZhangCS
opened
2 years ago
3
Cannot find the gradient handler example
#171
DarrenYing
opened
2 years ago
1
[colotensor] add megatron example via colotensor
#170
1SAA
closed
2 years ago
1
Error running feature/zero/train_v2.py when the model uses gradient_checkpointing
#169
wjizhong
closed
2 years ago
5
[zero] update zero example
#168
ver217
closed
2 years ago
0
fix opt model init
#167
ver217
closed
2 years ago
0
[Compatibility] Running OPT using PyTorch 1.12 and Gemini placement_policy = 'cuda' failed
#166
feifeibear
opened
2 years ago
3
[bert] update the zero pretraining and finetuning with new zero
#165
FrankLeeeee
closed
1 year ago
0
polish OPT code
#164
feifeibear
closed
2 years ago
0
[opt] fix bugs, polish code
#163
1SAA
closed
2 years ago
0
[hotfix] fix #158 import error for partition_uniform
#162
feifeibear
closed
2 years ago
0
Running the GPT2 example raises RuntimeError: Could not find 'SLURM_PROCID'. Is a SLURM environment required?
#161
ZXM1063694570
opened
2 years ago
1
[opt] add init_in_cpu
#160
1SAA
closed
2 years ago
0
[opt] migrate OPT model to ColoTensor API
#159
1SAA
closed
2 years ago
0