huggingface / transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.
https://huggingface.co/transformers
Apache License 2.0

run_language_modeling for T5 #10169

Closed ghost closed 3 years ago

ghost commented 3 years ago

Hi, based on the readme at [1], `run_language_modeling.py` does not support the T5 model so far; it would be really nice to include this model as well. There is also the line `data_args.block_size = tokenizer.max_len`, but `max_len` does not exist anymore. I searched the `PreTrainedTokenizer` class and did not find an equivalent attribute to substitute. Do you mind telling me how I can update this line to make this example work?

thank you.

[1] https://github.com/huggingface/transformers/blob/master/examples/legacy/run_language_modeling.py
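For readers hitting the same error: a minimal sketch of how that legacy line might be updated, assuming a recent transformers release where the old `max_len` attribute was replaced by `model_max_length` on the tokenizer (this is a sketch, not an official patch to the script):

```python
# Sketch only: assumes a transformers version where PreTrainedTokenizer
# exposes `model_max_length` instead of the removed `max_len` attribute.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

# Legacy line from the example script (no longer works):
#     data_args.block_size = tokenizer.max_len
# Possible replacement:
block_size = tokenizer.model_max_length
print(block_size)  # e.g. 1024 for GPT-2
```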

ghost commented 3 years ago

Hi

It seems to me this script is a duplicate of this other script: `transformers/examples/language-modeling/run_mlm.py`. Do you mind adding T5 to that script as well? Thanks.

NielsRogge commented 3 years ago

Actually, we cannot simply add T5 to this script, because run_mlm.py is for encoder-only models (such as BERT, RoBERTa, DeBERTa, etc.). T5 is an encoder-decoder (seq2seq) model, so this would require a new script. The seq2seq scripts currently only support fine-tuning, not pre-training.

cc @patil-suraj @sgugger
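To make the distinction concrete, below is a minimal sketch (not an official pre-training script) of T5's seq2seq denoising objective: corrupted spans in the encoder input are replaced by sentinel tokens, and the decoder is trained to generate the dropped spans, which is structurally different from the per-token masked prediction that run_mlm.py implements. The checkpoint name and the example sentence are illustrative only.

```python
# Sketch of T5's span-corruption (seq2seq denoising) objective.
# Masked spans in the input are replaced by sentinel tokens (<extra_id_0>, ...)
# and the decoder target reproduces the dropped spans after matching sentinels.
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# "The cute dog walks in the green park" with two spans corrupted.
input_ids = tokenizer("The <extra_id_0> walks in <extra_id_1> park",
                      return_tensors="pt").input_ids
labels = tokenizer("<extra_id_0> cute dog <extra_id_1> the <extra_id_2>",
                   return_tensors="pt").input_ids

# Standard seq2seq cross-entropy over the decoder outputs; this is why an
# encoder-only MLM script cannot be reused as-is for T5 pre-training.
loss = model(input_ids=input_ids, labels=labels).loss
print(loss.item())
```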

github-actions[bot] commented 3 years ago

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.