JonasGeiping / cramming

Cramming the training of a (BERT-type) language model into limited compute.

MIT License

1.29k stars, 100 forks

issues

Flash Attention

#48 versae closed 2 months ago
1
Determinist and update train.scheduler

#47 BoyuanFeng closed 3 months ago
0
How to load local data

#46 Doraemonzzz closed 5 months ago
2
Unable to replicate the results using the default command

#45 shiwenqin closed 5 months ago
15
From PR 43

#44 JonasGeiping closed 6 months ago
5
Fix pretraining_preparation.py - previously it would die with "BuilderConfig 'train' not found. Available: ['default']"

#43 euclaise closed 7 months ago
2
Fix two errors when running evaluation commands in README.md

#42 shiwenqin closed 7 months ago
1
Configs for GPT?

#41 rexdu2003 closed 7 months ago
2
Finetuning for token classification

#40 druskacik closed 8 months ago
3
torch._dynamo error on step 2: calling compiler function 'inductor'

#39 ionutmodo closed 9 months ago
7
TypeError: _load_optimizer() missing 1 required positional argument: 'initial_time'

#38 vincent-163 closed 9 months ago
1
can't import cramming

#37 RobinRojowiec closed 9 months ago
2
try it on Mac M1 but failed

#36 yangboz closed 9 months ago
2
Finetuning for SQuAD task

#35 kisacats closed 11 months ago
2
Uploading trained model to HF/saving in HF format locally

#34 NtaylorOX closed 11 months ago
8
Question about sparse token prediction

#33 leo-du closed 1 year ago
1
Issue with torch.compile / dynamo

#32 spencerfrei closed 1 year ago
5
Tutorial for pretrain RoBERTa with custom data

#31 iambestfeeddddd closed 1 year ago
2
I run the test command, got this error, how to fix it? looks like no dataset

#30 leescpeter closed 8 months ago
12
Evaluation failed on MNLI and STSB Datasets for Last1.13release

#29 Labyrintbs closed 1 year ago
3
GLUE evaluation numbers are very poor, if increase the sequence length to 512 and float 32

#28 tbaggu closed 1 year ago
5
Errors with both the verify installation command as well as the final recipe

#27 tatami-galaxy closed 1 year ago
2
Pretraining on a single RTX 3060

#26 TahaBinhuraib closed 1 year ago
2
fix typo in utils.py

#25 itay-nakash closed 1 year ago
0
RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)` while running evaluation

#24 tbaggu closed 1 year ago
10
initialize prefetch_factor to None when num_workers is zero

#23 itay-nakash closed 1 year ago
1
data preprocessing got failed during tokenization on single GPU

#22 tbaggu closed 1 year ago
9
Cola dataset evaluation

#21 TahaBinhuraib closed 1 year ago
4
Fix bug in batch size schedule

#20 JeanKaddour closed 1 year ago
1
Add Support for `torch.compile` During Pretraining for ~15% Speedup

#19 warner-benjamin closed 1 year ago
3
Fix automodels

#18 Randl closed 1 year ago
2
add tqdm bar to eval

#17 TahaBinhuraib closed 1 year ago
1
Can't evaluate

#16 TahaBinhuraib closed 1 year ago
5
Reproduce the result when freezing parameters

#15 sleeepeer closed 1 year ago
8
Fix `save_training_checkpoint`

#14 JeanKaddour closed 1 year ago
1
Preprocessed files on S3/Google Drive

#13 tals closed 1 year ago
2
Verification command fails on macOS

#12 laclouis5 closed 1 year ago
4
Training Step Count

#11 ekurtulus closed 1 year ago
5
Suggestion : support Maximal Update Parameterization

#10 tfisher98 closed 1 year ago
2
loading checkpoints for using as a huggingface model

#9 itay-nakash closed 1 year ago
7
TypeError: _new_shared() got an unexpected keyword argument 'device'

#8 wccccp closed 1 year ago
2
preprocessed c4 dataset?

#7 w32zhong closed 1 year ago
3
Storage space requirement

#6 okpatil4u closed 1 year ago
4
added backward compatibility for typing annotations.

#5 w32zhong closed 1 year ago
0
enhancement0

#4 ss18 closed 1 year ago
0
Fix typo in curriculum_sorting.py

#3 eltociear closed 1 year ago
0
Preprocessing for final recipe

#2 florianmai closed 1 year ago
10
Update README.md

#1 eltociear closed 1 year ago
0