kermitt2 / delft

a Deep Learning Framework for Text https://delft.readthedocs.io/
Apache License 2.0

Load transformer config and tokenizer from disk when n>1 for nfold training #125

Closed kermitt2 closed 2 years ago

kermitt2 commented 2 years ago

When performing an nfold training for text classification or sequence labeling, we currently reload the transformer configuration and the tokenizer via AutoModel and the Hugging Face Hub n times, once for each fold model. To limit access to the Hugging Face Hub (which is not very reliable), we should only make an online access the first time, for n=1, and then load the transformer configuration and the transformer tokenizer from file, because both have been saved when building the model for n=1.
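A minimal sketch of the proposed logic, not the actual delft implementation: the helper names (`should_load_from_disk`, `load_config_and_tokenizer`) and the `local_dir` layout are hypothetical; only the `transformers` calls (`AutoConfig`/`AutoTokenizer` `from_pretrained` and `save_pretrained`) are the real Hugging Face API.

```python
from pathlib import Path


def should_load_from_disk(fold_index: int, local_dir: str) -> bool:
    # A saved copy is usable for any fold after the first,
    # once fold 1 has written config.json to local_dir.
    return fold_index > 0 and (Path(local_dir) / "config.json").exists()


def load_config_and_tokenizer(model_name: str, local_dir: str, fold_index: int):
    # Hypothetical helper: go online only for the first fold,
    # then reuse the artifacts saved under local_dir for folds 2..n.
    from transformers import AutoConfig, AutoTokenizer  # lazy import

    if should_load_from_disk(fold_index, local_dir):
        source = local_dir      # offline: reuse what the first fold saved
    else:
        source = model_name     # online: first fold fetches from the Hub
    config = AutoConfig.from_pretrained(source)
    tokenizer = AutoTokenizer.from_pretrained(source)
    if source == model_name:
        # Persist so the remaining n-1 folds never touch the Hub.
        config.save_pretrained(local_dir)
        tokenizer.save_pretrained(local_dir)
    return config, tokenizer
```

The key point is that `save_pretrained` writes `config.json` and the tokenizer files to disk during the first fold, so every subsequent fold can pass the local directory to `from_pretrained` instead of the Hub model name.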