bertin-project / bertin-t5x

BERTIN Project T5X training files
Apache License 2.0

How to train the efficient T5 models? #4

Closed: StephennFernandes closed this issue 2 years ago

StephennFernandes commented 2 years ago

Hey @versae, this is about the recent paper "Scale Efficiently": https://arxiv.org/abs/2109.10686

There are better, more efficient variants of T5 and mT5, but I couldn't find these efficient models in the T5X repo.

If I have to train these variants of the T5 models, how do I train them using T5X?

versae commented 2 years ago

The link is here: https://console.cloud.google.com/storage/browser/scenic-bucket. Just browse there. With T5X, you just point to the correct gin file.
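For readers wondering what "pointing to the correct gin file" looks like in practice, here is a minimal sketch that assembles a T5X training command. The `--gin_file` and `--gin.MODEL_DIR` flags follow the usage shown in the T5X README; the T5X checkout location, the gin file path, the output directory, and the `TRAIN_STEPS` override are placeholders and assumptions, not paths or values taken from this thread or from the bucket above.

```python
import shlex


def build_t5x_train_command(t5x_dir, gin_file, model_dir, overrides=None):
    """Build the argv list for a T5X training run driven by one gin file.

    All arguments are caller-supplied; nothing here is read from T5X itself.
    """
    argv = [
        "python3",
        f"{t5x_dir}/t5x/train.py",
        f"--gin_file={gin_file}",
        # Gin string bindings need their own quotes inside the flag value.
        f'--gin.MODEL_DIR="{model_dir}"',
    ]
    # Optional extra gin bindings, e.g. {"TRAIN_STEPS": 1000}.
    for name, value in (overrides or {}).items():
        argv.append(f"--gin.{name}={value!r}")
    return argv


# Hypothetical paths: replace them with your own checkout, the gin file of
# the efficient variant you want to train, and your output directory.
command = build_t5x_train_command(
    t5x_dir="/path/to/t5x",
    gin_file="/path/to/efficient_variant.gin",
    model_dir="/path/to/model_dir",
    overrides={"TRAIN_STEPS": 1000},  # assumed macro name; check your gin file
)
print(shlex.join(command))
```

The list form can be passed directly to `subprocess.run(command)`, which avoids shell quoting problems with the nested quotes that gin string bindings require.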

StephennFernandes commented 2 years ago

@versae thanks! By the way, did you try pretraining on the latest GitHub commit of T5X? I am facing issues: it's throwing errors in the utils.py file.
The errors involve the trim_output_features=cfg.trim_output_features argument in the seqio.get_dataset() call. There are also issues in the LegacyCheckpointer: TypeError: Can't instantiate abstract class LegacyCheckpointer with abstract methods async_restore, async_save

versae commented 2 years ago

Sorry, no idea. Maybe try an older commit. AFAIK, T5X is under heavy development, so chances are a given commit might not work at some point.
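As background on the TypeError quoted above: Python raises it whenever a class is instantiated while some abstract methods inherited from its base class are still unimplemented. One plausible cause (an assumption, not something confirmed in this thread) is a version mismatch in which a base class gained new abstract methods that an older subclass does not define. The sketch below reproduces that situation with a hypothetical `BaseCheckpointer`; it is not the actual T5X class hierarchy.

```python
import abc


class BaseCheckpointer(abc.ABC):
    """Hypothetical base class standing in for a newer library release."""

    @abc.abstractmethod
    def save(self, path):
        ...

    @abc.abstractmethod
    def restore(self, path):
        ...

    # Imagine these two were added in a later release of the base class.
    @abc.abstractmethod
    def async_save(self, path):
        ...

    @abc.abstractmethod
    def async_restore(self, path):
        ...


class LegacyCheckpointer(BaseCheckpointer):
    """Subclass written against the older interface: only save/restore."""

    def save(self, path):
        return f"saved to {path}"

    def restore(self, path):
        return f"restored from {path}"


def try_instantiate(cls):
    """Return the TypeError message if cls cannot be instantiated, else None."""
    try:
        cls()
    except TypeError as err:
        return str(err)
    return None


# The message names the abstract methods the subclass is still missing.
message = try_instantiate(LegacyCheckpointer)
print(message)
```

If a mismatch like this is the cause, checking out mutually compatible commits or releases of the packages involved, as suggested above, is the usual way out. The exact wording of the message differs between Python versions.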