kojimano
/
Megatron-DeepSpeed-ABCI
Other
5
stars
2
forks
issues
created README
#32
kojimano
closed
1 year ago
0
Scale to 544 GPUs
#31
kojimano
closed
1 year ago
0
Update replace_breaks.py
#30
kuriyan1204
closed
1 year ago
0
Update replace_breaks.py
#29
kuriyan1204
closed
1 year ago
0
Add line break replacer
#28
kuriyan1204
closed
1 year ago
0
Add `\n` replace script
#27
kuriyan1204
closed
1 year ago
1
Add github dataset from redpajama
#26
kuriyan1204
opened
1 year ago
1
Debug data processing pipelines
#25
kuriyan1204
opened
1 year ago
2
About the tokenizer and related matters
#24
keisks
opened
1 year ago
0
Adding Abeja Japanese Tokenizer into a pipeline.
#23
kojimano
closed
1 year ago
0
CyberAgent Preprocessing / Binarize Data
#22
kojimano
closed
1 year ago
0
Dataset Preprocessing Validation
#21
kojimano
opened
1 year ago
0
General Preprocessing Pipeline
#20
kojimano
opened
1 year ago
0
ABCI 544 GPU rehearsal
#19
kojimano
closed
1 year ago
0
Aozora Bunko Preprocessing / Binarize Data
#18
kojimano
closed
1 year ago
1
Wikipedia Preprocessing / Binarize Data
#17
kojimano
closed
1 year ago
1
ABCI GPU benchmark
#16
kojimano
closed
1 year ago
1
Prepare the dataset to use on SambaNova
#15
losyer
closed
1 year ago
0
Timeline
#14
keisks
closed
1 year ago
0
About instruction tuning
#13
keisks
opened
1 year ago
4
What to train on the SambaNova server
#12
losyer
closed
1 year ago
0
Which data to use for SambaNova training
#11
losyer
closed
1 year ago
1
Where to do which preprocessing steps
#10
losyer
closed
1 year ago
2
Where to store the data
#9
losyer
closed
1 year ago
3
How to finalize and consolidate the training data
#8
losyer
closed
1 year ago
0
Questions to ask
#7
losyer
closed
1 year ago
1
Confirm how CACC is tokenized
#6
losyer
closed
1 year ago
1
stats of CACC data
#5
losyer
closed
1 year ago
2
Decide which data to use for the ABCI Grand Challenge
#4
losyer
closed
1 year ago
5
data_news_articles
#3
kojimano
closed
1 year ago
2
evaluation_jglue
#2
kojimano
opened
1 year ago
2
model_tokenizer
#1
kojimano
closed
1 year ago
2