ayushtiku5 opened 4 years ago
One option would be to download the model during build from here: https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/v0.2/
And then unzip it during build. The model can then be loaded via SentenceTransformer('/path/to/unzipped/folder').
Another (maybe better) option would be to mount the model as a volume into Docker, i.e. you keep the unzipped model on your host system and mount that folder so that the container can access it. This keeps the Docker image smaller, as no model has to be included. You could also swap the model by changing the volume mount without needing to re-build your Docker image.
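A minimal sketch of this volume-mount approach (the image name and both paths here are illustrative, not taken from this thread):

```shell
# Host side: the unzipped model lives in /home/user/models/roberta-base-nli-stsb-mean-tokens
# Mount it read-only into the container at /app/model
docker run --rm \
  -v /home/user/models/roberta-base-nli-stsb-mean-tokens:/app/model:ro \
  my-sbert-image
# Inside the container, the app would then call SentenceTransformer('/app/model')
```

Swapping models is then just a matter of changing the host path in the -v argument.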
Best, Nils Reimers
I'm having issues doing this.
I've tried both the models on sbert.com and the one under Nils' link from above. The model unzips just fine and maps to the right folder, but when initializing I'm getting:
ValueError: Unrecognized model in /seal/model/0_RoBERTa. Should have a `model_type` key in its config.json, or contain one of the following strings in its name: retribert, t5, mobilebert, distilbert, albert, bert-generation, camembert, xlm-roberta, pegasus, marian, mbart, bart, reformer, longformer, roberta, flaubert, fsmt, bert, openai-gpt, gpt2, transfo-xl, xlnet, xlm, ctrl, electra, encoder-decoder, funnel, lxmert, dpr, layoutlm, rag
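For context, newer Huggingface checkpoints declare the architecture via a model_type key in config.json, which is exactly what this error is checking for. An illustrative (not complete) fragment for a RoBERTa model might look like:

```json
{
  "model_type": "roberta",
  "architectures": ["RobertaModel"]
}
```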
Can you try the distilbert models and see if it works?
What does your code that loads the model from the unzipped folder look like?
Just tested with one of the distilbert models. It worked fine! Is there a problem with the RoBERTa models then?
On dockerfile:
ADD https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/v0.2/roberta-base-nli-stsb-mean-tokens.zip .
RUN unzip ./roberta-base-nli-stsb-mean-tokens.zip -d /seal/model && rm ./roberta-base-nli-stsb-mean-tokens.zip
In the app:
...
model = SentenceTransformer("/seal/model")
(I also tried /seal/model/0_RoBERTa, which didn't work due to not having a __version__ field in the config dict.)
Hi @igorecarrasco, I updated the roberta models to the newest version of the Huggingface config file format.
Can you retry and see if it works now? If there is still a version in ~/.cache/sentence_transformers, you first have to delete it to force a new download of the model.
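A small sketch of how one might clear such a cached copy programmatically; the ~/.cache/sentence_transformers path is an assumption based on this thread, so double-check where your version of the library actually caches models:

```python
import shutil
from pathlib import Path


def clear_model_cache(cache_root: Path) -> bool:
    """Delete a cached model directory if present.

    Returns True if a directory was removed, False if nothing was there.
    """
    if cache_root.exists():
        shutil.rmtree(cache_root)
        return True
    return False


# Typical call (the cache location is an assumption, verify it on your system):
# clear_model_cache(Path.home() / ".cache" / "sentence_transformers")
```

Deleting the whole cache directory is the blunt option; you can also pass a subdirectory for a single model to force only that one to re-download.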
That worked perfectly. Thanks for the quick help!
Hi @nreimers
I downloaded roberta-large-nli-stsb-mean-tokens
ADD https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/v0.2/roberta-large-nli-stsb-mean-tokens.zip .
RUN unzip ./roberta-large-nli-stsb-mean-tokens.zip -d /skip-node-models/roberta-large-nli-stsb-mean-tokens
and then used it like model = SentenceTransformer("/skip-node-models/roberta-large-nli-stsb-mean-tokens")
but it is throwing the following error
[Errno 2] No such file or directory: '/skip-node-models/roberta-large-nli-stsb-mean-tokens/0_Transformer/sentence_bert_config.json'
I suppose the model should be looking for sentence_roberta_config.json instead of sentence_bert_config.json, right? Because while the files were being extracted during the docker build, I saw a sentence_roberta_config.json file being extracted.
Hi @ayushtiku5, thanks for pointing this out. I need to fix that.
The newest version works with sentence_*_config.json, but older versions expect sentence_bert_config.json.
So when you update sentence-transformers, it should work.
I will try to fix the models later today.
Is there any site like https://public.ukp.informatik.tu-darmstadt.de/reimers/sentence-transformers/v0.2/ for the latest version of sentence-transformers?
The models there also work for version 0.3.x of sentence-transformers.
Updated the name in the roberta models. Hope it works now.
Hi, I am facing the same issue with the Sentence Transformer Bert Base models but was able to load the Roberta model.
Adding the following to the config helped for Bert Base, but I am not sure this is all that is needed:
"_name_or_path": "bert-base",
"architectures": [
"BertModel"
],
Adding the following to my Dockerfile worked for downloading and setting up the model. It just causes the initial download to happen during the docker build step instead of at runtime:
RUN python -c 'from sentence_transformers import SentenceTransformer; SentenceTransformer("<model-name>")'
@chanind do you know where the model gets downloaded when using this approach?
Thanks for this @chanind
I added the path param and this saves the pre-trained model in the models directory:
RUN python3 -c "from sentence_transformers import SentenceTransformer; model = SentenceTransformer('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2'); model.save('models')"
How do you access a model via S3? When I pass the direct link or an s3:// path, it returns an incorrect path.
@chanind do you know where the model gets downloaded when using this approach?
/root/.cache/torch/sentence_transformers/<model_name>
Whenever we call SentenceTransformer(model_name), it downloads the pre-trained model from the server. But this happens at runtime. I want to run this in a Docker container, and I want to know if there is any command I can add inside the Dockerfile so that the pre-trained model gets downloaded during the docker build itself?
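Combining the suggestions earlier in this thread, a minimal Dockerfile sketch for baking the model in at build time; the base image, model name, and /app/model path are illustrative choices, not taken from this thread:

```dockerfile
FROM python:3.9-slim
RUN pip install sentence-transformers
# Download the model at build time and save it to a fixed path inside the image
RUN python -c "from sentence_transformers import SentenceTransformer; \
SentenceTransformer('sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2').save('/app/model')"
# The application can then load it without network access:
#   model = SentenceTransformer('/app/model')
```

Saving to an explicit path via model.save() avoids depending on where the library's cache happens to live inside the image.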