JonasGeiping/cramming
Cramming the training of a (BERT-type) language model into limited compute.
MIT License
1.29k stars, 100 forks
issues
#48 Flash Attention (versae, closed, 2 months ago, 1 comment)
#47 Determinist and update train.scheduler (BoyuanFeng, closed, 3 months ago, 0 comments)
#46 How to load local data (Doraemonzzz, closed, 5 months ago, 2 comments)
#45 Unable to replicate the results using the default command (shiwenqin, closed, 5 months ago, 15 comments)
#44 From PR 43 (JonasGeiping, closed, 6 months ago, 5 comments)
#43 Fix pretraining_preparation.py - previously it would die with "BuilderConfig 'train' not found. Available: ['default']" (euclaise, closed, 7 months ago, 2 comments)
#42 Fix two errors when running evaluation commands in README.md (shiwenqin, closed, 7 months ago, 1 comment)
#41 Configs for GPT? (rexdu2003, closed, 7 months ago, 2 comments)
#40 Finetuning for token classification (druskacik, closed, 8 months ago, 3 comments)
#39 torch._dynamo error on step 2: calling compiler function 'inductor' (ionutmodo, closed, 9 months ago, 7 comments)
#38 TypeError: _load_optimizer() missing 1 required positional argument: 'initial_time' (vincent-163, closed, 9 months ago, 1 comment)
#37 can't import cramming (RobinRojowiec, closed, 9 months ago, 2 comments)
#36 try it on Mac M1 but failed (yangboz, closed, 9 months ago, 2 comments)
#35 Finetuning for SQuAD task (kisacats, closed, 11 months ago, 2 comments)
#34 Uploading trained model to HF/saving in HF format locally (NtaylorOX, closed, 11 months ago, 8 comments)
#33 Question about sparse token prediction (leo-du, closed, 1 year ago, 1 comment)
#32 Issue with torch.compile / dynamo (spencerfrei, closed, 1 year ago, 5 comments)
#31 Tutorial for pretrain RoBERTa with custom data (iambestfeeddddd, closed, 1 year ago, 2 comments)
#30 I run the test command,got this error,how to fix it?looks like no dataset (leescpeter, closed, 8 months ago, 12 comments)
#29 Evaluation failed on MNLI and STSB Datasets for Last1.13release (Labyrintbs, closed, 1 year ago, 3 comments)
#28 GLUE evaluation numbers are very poor, if increase the sequence length to 512 and float 32 (tbaggu, closed, 1 year ago, 5 comments)
#27 Errors with both the verify installation command as well as the final recipe (tatami-galaxy, closed, 1 year ago, 2 comments)
#26 Pretraining on a single RTX 3060 (TahaBinhuraib, closed, 1 year ago, 2 comments)
#25 fix typo in utils.py (itay-nakash, closed, 1 year ago, 0 comments)
#24 RuntimeError: CUDA error: CUBLAS_STATUS_NOT_INITIALIZED when calling `cublasCreate(handle)` while running evaluation (tbaggu, closed, 1 year ago, 10 comments)
#23 initialize prefetch_factor to None when num_workers is zero (itay-nakash, closed, 1 year ago, 1 comment)
#22 data preprocessing got failed during tokenization on single GPU (tbaggu, closed, 1 year ago, 9 comments)
#21 Cola dataset evaluation (TahaBinhuraib, closed, 1 year ago, 4 comments)
#20 Fix bug in batch size schedule (JeanKaddour, closed, 1 year ago, 1 comment)
#19 Add Support for `torch.compile` During Pretraining for ~15% Speedup (warner-benjamin, closed, 1 year ago, 3 comments)
#18 Fix automodels (Randl, closed, 1 year ago, 2 comments)
#17 add tqdm bar to eval (TahaBinhuraib, closed, 1 year ago, 1 comment)
#16 Can't evaluate (TahaBinhuraib, closed, 1 year ago, 5 comments)
#15 Reproduce the result when freezing parameters (sleeepeer, closed, 1 year ago, 8 comments)
#14 Fix `save_training_checkpoint` (JeanKaddour, closed, 1 year ago, 1 comment)
#13 Preprocessed files on S3/Google Drive (tals, closed, 1 year ago, 2 comments)
#12 Verification command fails on macOS (laclouis5, closed, 1 year ago, 4 comments)
#11 Training Step Count (ekurtulus, closed, 1 year ago, 5 comments)
#10 Suggestion : support Maximal Update Parameterization (tfisher98, closed, 1 year ago, 2 comments)
#9 loading checkpoints for using as a huggingface model (itay-nakash, closed, 1 year ago, 7 comments)
#8 TypeError: _new_shared() got an unexpected keyword argument 'device' (wccccp, closed, 1 year ago, 2 comments)
#7 preprocessed c4 dataset? (w32zhong, closed, 1 year ago, 3 comments)
#6 Storage space requirement (okpatil4u, closed, 1 year ago, 4 comments)
#5 added backward compatibility for typing annotations. (w32zhong, closed, 1 year ago, 0 comments)
#4 enhancement0 (ss18, closed, 1 year ago, 0 comments)
#3 Fix typo in curriculum_sorting.py (eltociear, closed, 1 year ago, 0 comments)
#2 Preprocessing for final recipe (florianmai, closed, 1 year ago, 10 comments)
#1 Update README.md (eltociear, closed, 1 year ago, 0 comments)