Implement the plain model loading (limited to BERT models)

kermitt2 / delft

a Deep Learning Framework for Text https://delft.readthedocs.io/

Apache License 2.0


Implement the plain model loading (limited to BERT models) #143

Closed lfoppiano closed 9 months ago

lfoppiano commented 2 years ago

This PR attempts to implement a way to load TF checkpoints and PyTorch models in a "plain" way, i.e. from just the ckpt files / pytorch_model.bin and the vocab.txt.

I use this for loading pre-trained BERT/SciBERT embeddings using the original implementation with delft 0.3.0.

I've used BertTokenizerFast for the tokenizer. For the models, I tried to guess whether a model is PyTorch or TF based on the model name.
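For readers following along, the approach described above could look roughly like the sketch below. It is not the code from this PR: the helper names `guess_framework` and `load_plain_tokenizer` and the file-name heuristics are assumptions made for illustration. The only library call used is the `BertTokenizerFast` constructor with its `vocab_file` and `do_lower_case` arguments, imported lazily so that the name-guessing part runs without `transformers` installed.

```python
from pathlib import Path


def guess_framework(model_name):
    """Guess the framework from the model file name (hypothetical heuristic).

    Returns "pytorch", "tf", or "unknown".
    """
    name = Path(model_name).name.lower()
    # pytorch_model.bin and *.pt files are treated as PyTorch weights.
    if name.endswith((".bin", ".pt")):
        return "pytorch"
    # bert_model.ckpt.index, bert_model.ckpt.data-*, and *.h5 are treated as TF.
    if ".ckpt" in name or name.endswith(".h5"):
        return "tf"
    return "unknown"


def load_plain_tokenizer(vocab_path, do_lower_case):
    """Build a tokenizer from a bare vocab.txt (requires `transformers`).

    do_lower_case is passed explicitly because the library default is True,
    which would silently lowercase input for a cased vocabulary.
    """
    from transformers import BertTokenizerFast  # lazy import

    return BertTokenizerFast(vocab_file=str(vocab_path), do_lower_case=do_lower_case)
```

A name-based guess is fragile by design: any file that matches neither pattern falls through to "unknown", and the caller has to decide what to do with it.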

lfoppiano commented 1 year ago

After several experiments and failures, I think this PR might not be a good candidate for integration into DeLFT, as it could lead to unpredictable behaviours (e.g. do_lower_case=True in the output configuration). Nevertheless, I find it quite difficult to convert a model configuration (including the tokenizer configuration) from TensorFlow to a workable Hugging Face format, and I haven't found any useful documentation for that.
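One partial workaround for the configuration issue mentioned above is to write the Hugging Face style configuration files by hand, so that `do_lower_case` is stated explicitly rather than left to a default. The sketch below is only an illustration and not a verified conversion recipe. It assumes the model directory contains the `bert_config.json` shipped with the original TF BERT releases, and the helper name `write_hf_style_configs` is hypothetical. It does not convert the weights themselves.

```python
import json
from pathlib import Path


def write_hf_style_configs(model_dir, do_lower_case):
    """Write config.json and tokenizer_config.json next to the plain files.

    Assumes model_dir contains bert_config.json (original TF release naming).
    Returns the tokenizer configuration that was written.
    """
    model_dir = Path(model_dir)

    # Copy the original model configuration under the file name that
    # Hugging Face loaders look for, adding a model_type if it is missing.
    bert_config = json.loads((model_dir / "bert_config.json").read_text(encoding="utf-8"))
    bert_config.setdefault("model_type", "bert")
    (model_dir / "config.json").write_text(json.dumps(bert_config, indent=2), encoding="utf-8")

    # State the casing explicitly so it is not filled in by a default.
    tokenizer_config = {"do_lower_case": bool(do_lower_case)}
    (model_dir / "tokenizer_config.json").write_text(
        json.dumps(tokenizer_config, indent=2), encoding="utf-8"
    )
    return tokenizer_config
```

For a cased model such as a cased SciBERT variant, this would be called with `do_lower_case=False`. Whether the resulting directory loads cleanly still depends on the weights being in a format the library can read.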