pytorch / text

Models, data loaders and abstractions for language processing, powered by PyTorch
https://pytorch.org/text
BSD 3-Clause "New" or "Revised" License

Install with embeddings #1069

Open marcglobality opened 3 years ago

marcglobality commented 3 years ago

I would like to install torchtext together with the GloVe embeddings, so that when I deploy the Docker image it doesn't have to download them every time. Is there a good way to do this from the CLI, other than doing a hacky wget myself and placing the files in the correct directory?

e.g.

pip install torchtext[glove]

or, as in spaCy,

python -m torchtext download glove

Thanks!

zhangguanheng66 commented 3 years ago

Definitely. If you switch to the new GloVe vectors (link), the GloVe files are downloaded only once, the first time you use them. Keep in mind that the new GloVe vectors are only in the nightly release, so you need to install the nightly package:

pip install --pre torch torchtext -f https://download.pytorch.org/whl/nightly/cpu/torch_nightly.html

We also provide pip and conda packages for Windows/MacOS/Linux.

marcglobality commented 3 years ago

Hi @zhangguanheng66, thanks for your quick answer. What I would like is a pip command that downloads the embeddings before they are used, so that they are already saved in ~.

This way, every time I start the Docker image on a different box, it doesn't have to download them the first time they are used.

zhangguanheng66 commented 3 years ago

OK, then this is a new feature request.

marcglobality commented 3 years ago

> OK, then this is a new feature request.

by your answer, I understand this is not possible right now (?)

zhangguanheng66 commented 3 years ago

> OK, then this is a new feature request.

> by your answer, I understand this is not possible right now (?)

That's right. A temporary workaround is to download the GloVe embeddings with a shell script.
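For anyone who would rather keep the workaround in Python than in shell, here is a minimal sketch of the same idea: fetch the archive once at image build time and unpack it into a cache directory. The helper name `prefetch_vectors` is hypothetical and not part of torchtext, and the URL is my assumption about where the 6B GloVe archive is hosted, so please verify it before relying on it.

```python
import urllib.request
import zipfile
from pathlib import Path

# Assumption: location of the GloVe 6B archive; check it before use.
GLOVE_URL = "http://nlp.stanford.edu/data/glove.6B.zip"


def prefetch_vectors(url: str, cache_dir: str) -> Path:
    """Download a zip of embeddings into cache_dir and unpack it, once.

    Hypothetical helper, not a torchtext API.
    """
    cache = Path(cache_dir).expanduser()
    cache.mkdir(parents=True, exist_ok=True)

    # Name the local archive after the last path component of the URL.
    archive = cache / url.rsplit("/", 1)[-1]
    if not archive.exists():
        # Download under a temporary name so an interrupted build
        # does not leave a truncated archive behind.
        partial = archive.with_name(archive.name + ".part")
        urllib.request.urlretrieve(url, str(partial))
        partial.replace(archive)
        # Unpack only right after a fresh download.
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(cache)
    return cache
```

Calling `prefetch_vectors(GLOVE_URL, ".vector_cache")` from a `RUN` step in the Dockerfile should bake the files into the image. This assumes that torchtext looks for already extracted vector files in its cache directory (`.vector_cache` by default, or whatever is passed as `cache` to `torchtext.vocab.GloVe`) before downloading, so the paths have to match what your code uses at runtime.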