Closed sh4dmi closed 1 year ago
Yeah, sure. Isn't the MT5 model just the T5 model, but trained on a multilingual dataset?
Anyway, to use a different model for T5 pre-training, you need to make sure that your model's forward function has a signature compatible with that of the vanilla T5 model.
Furthermore, you need to import your model and add it to this model dictionary.
The model config from Hugging Face is imported here; all you need to do is change args.model.name to your desired model.
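To make the registration step concrete, here is a minimal sketch of the pattern described above: a dictionary mapping config names to model classes, which the training script looks up via args.model.name. The class and dictionary names here are illustrative stand-ins, not the repo's actual identifiers; in real code the classes would come from transformers.

```python
# Hedged sketch of the model-dictionary pattern (illustrative names only).

class T5Stub:
    """Stand-in for the vanilla T5 class; real code would import it from transformers."""
    def forward(self, input_ids, attention_mask=None, labels=None):
        # The training loop calls forward() with this signature,
        # so any swapped-in model must accept the same arguments.
        ...

class MT5Stub(T5Stub):
    """MT5 is architecturally T5 (just trained multilingually),
    so it can reuse the same forward signature unchanged."""
    pass

# The dictionary the training script consults via args.model.name.
MODEL_REGISTRY = {
    "t5": T5Stub,
    "mt5": MT5Stub,  # <- add your model here
}

def build_model(name):
    """Instantiate whichever model args.model.name selects."""
    return MODEL_REGISTRY[name]()

model = build_model("mt5")
print(type(model).__name__)  # -> MT5Stub
```

So as long as your MT5 class keeps T5's forward signature, setting args.model.name to the new registry key is all the training script needs.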
Hope that helps. Feel free to ask if you have more questions; happy to help.
We weren't able to find any documentation on training a different model and dataset with your implementation. We want to train an MT5 model, restricted to Hebrew and English, on a Hebrew Wikipedia dataset using your T5 implementation. Is there a way to do this with your help?