PiotrNawrot / nanoT5
Fast & Simple repository for pre-training and fine-tuning T5-style models
Apache License 2.0 · 970 stars · 74 forks
Issues
#45 · How to change training objective from next token prediction to Masked Language Modeling? · HaninZeyad · opened 2 months ago · 0 comments
#44 · A possible bug in the generate method · SiyuanHuangSJTU · closed 2 months ago · 1 comment
#43 · Focus · peter-sk · closed 3 months ago · 0 comments
#42 · The weird curve · nguyenvannghiem0312 · opened 3 months ago · 1 comment
#41 · How to replace the tokenizer with another one? · hifarer · opened 3 months ago · 2 comments
#40 · nanoT5 for different embeddings · victoriazinkovich · closed 3 months ago · 2 comments
#39 · checkpoint-pt-151 does not appear to have a file named config.json · dinhngoc267 · closed 4 months ago · 1 comment
#38 · About pre-training objectives · Respaired · closed 5 months ago · 1 comment
#37 · Pre-training on a local C4 dataset? · TTTTCoding · closed 5 months ago · 1 comment
#36 · Continued pre-training from official models · IdeaKing · closed 6 months ago · 1 comment
#35 · Just a quick question about pre-training Flan-T5 · hohoCode · closed 6 months ago · 5 comments
#34 · Learning rate for multi-GPU training · phucdoitoan · closed 7 months ago · 3 comments
#33 · Beginner question: would it be wise to use this as a backbone for custom seq2seq modeling of fMRI data with a custom encoder? · dyhan316 · closed 7 months ago · 2 comments
#32 · Question about implementing whole-word masking in nanoT5 · brick-pid · closed 7 months ago · 1 comment
#31 · Silly question: why do you need to re-implement the T5 model? · phucdoitoan · closed 7 months ago · 3 comments
#30 · How to create a pytorch_model.bin file? · mayanks43 · closed 8 months ago · 1 comment
#29 · Larger models and training on the Pile · Taytay · closed 9 months ago · 5 comments
#28 · Flash attention · Taytay · closed 9 months ago · 2 comments
#27 · Pre-train on a different dataset than C4 · nikifori · closed 9 months ago · 1 comment
#26 · Transformation to an HF model · ghost · closed 10 months ago · 0 comments
#25 · nanoT5 initializes lm_head weights with 768x too much variance, probably · Birch-san · opened 11 months ago · 19 comments
#24 · Self-defined loss function fails to work (torch._dynamo.exc.InternalTorchDynamoError: ln_encoder) · QinengWang-Aiden · closed 1 year ago · 4 comments
#23 · Update README.md · eltociear · closed 9 months ago · 0 comments
#22 · Pre-training fails at step 30155 out of 32768 steps every time · QinengWang-Aiden · closed 1 year ago · 7 comments
#21 · About pre-training on another dataset · tarudesu · closed 1 year ago · 7 comments
#20 · AttributeError: Can't pickle local object 'IterableDataset.map.<locals>.<lambda>' · turian · closed 1 year ago · 1 comment
#19 · Difficulty applying nanoT5 to a different model and database · sh4dmi · closed 1 year ago · 2 comments
#18 · How to run on CPU · ratan-prasad · closed 1 year ago · 1 comment
#17 · Change citing style, remove previous graphics · PiotrNawrot · closed 1 year ago · 0 comments
#16 · Pre-train on long context · enpassanty · closed 1 year ago · 1 comment
#15 · RMS scaling issues · SmerkyG · closed 1 year ago · 15 comments
#14 · Shape mismatch warning · TuTruongVian · closed 1 year ago · 1 comment
#13 · [nanoT5 v1.1] Mixed precision training + export T5 model · PiotrNawrot · closed 1 year ago · 0 comments
#12 · Query regarding multi-GPU · trinanjan12 · closed 1 year ago · 9 comments
#11 · Pre-training on my own dataset · trinanjan12 · closed 1 year ago · 1 comment
#10 · Error encountered during multi-GPU training with torch compile enabled · jzhang38 · closed 1 year ago · 2 comments
#9 · Why isn't the lr warmed up from 0? · jzhang38 · closed 1 year ago · 1 comment
#8 · Fix a bug in the total-steps calculation from epoch count · QizhiPei · closed 1 year ago · 1 comment
#7 · Resume the pre-training process · QizhiPei · closed 1 year ago · 5 comments
#6 · Pre-trained nanoT5 model on the C4 corpus · SungHo3268 · closed 1 year ago · 5 comments
#5 · Fine-tuning error: No module named adaptive.moe · fancyisbest · closed 1 year ago · 2 comments
#4 · Have you tried any benchmark other than SNI? · zixiliuUSC · closed 1 year ago · 1 comment
#3 · Computing ROUGE score during training · sjelassi · closed 1 year ago · 2 comments
#2 · Update README.md · PiotrNawrot · closed 1 year ago · 0 comments
#1 · Citing repo · dhairyadalal · closed 1 year ago · 4 comments