PiotrNawrot / nanoT5

Fast & Simple repository for pre-training and fine-tuning T5-style models
Apache License 2.0
970 stars · 74 forks

Difficulty applying NanoT5 to different model and database #19

Closed sh4dmi closed 1 year ago

sh4dmi commented 1 year ago

We weren't able to find specific documentation on training a different model and dataset with your repository. We want to train an MT5 model, which we restricted to Hebrew and English, on a Hebrew Wikipedia dataset using your T5 implementation. Is there a way to do this with your help?

PiotrNawrot commented 1 year ago

Yeah, sure. Isn't the MT5 model just the T5 model trained on a multilingual dataset? Anyway, to use any different model for T5 pre-training you need to make sure that your forward function has a signature compatible with that of the vanilla T5 model. Furthermore, you need to import your model and include it in this model dictionary. The model config from Huggingface is imported here, but all you need to do is change args.model.name to your desired model.
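To illustrate the pattern described above, here is a minimal, self-contained sketch of a model dictionary keyed by `args.model.name`. All names here (`MODEL_REGISTRY`, `MyT5Model`, `build_model`) are hypothetical stand-ins for illustration, not nanoT5's actual identifiers:

```python
# Hypothetical sketch of the model-dictionary pattern; the class and
# function names are illustrative, not nanoT5's real ones.
from types import SimpleNamespace


class MyT5Model:
    """Stand-in for a custom model class. The real class must expose a
    forward() signature compatible with vanilla T5, e.g. accepting
    input_ids, attention_mask, and labels."""

    def __init__(self, config):
        self.config = config


# The dictionary maps a model name to its constructor; adding a custom
# model means importing its class and registering it here.
MODEL_REGISTRY = {
    "t5": MyT5Model,
    "google/mt5-small": MyT5Model,  # entry you add for your own model
}


def build_model(args):
    # args.model.name selects which registry entry is instantiated,
    # mirroring how nanoT5 picks the Huggingface config by name.
    cls = MODEL_REGISTRY[args.model.name]
    return cls(config={"name": args.model.name})


args = SimpleNamespace(model=SimpleNamespace(name="google/mt5-small"))
model = build_model(args)
print(type(model).__name__)  # -> MyT5Model
```

In the actual repository, the registry lookup and the config import live in the files linked above; the sketch only shows the shape of the change, not the exact code.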

PiotrNawrot commented 1 year ago

Hope that helped. Feel free to ask if you have more questions — happy to help.