kermitt2 / delft

a Deep Learning Framework for Text https://delft.readthedocs.io/
Apache License 2.0

Automatically download embeddings #79

Closed kermitt2 closed 2 years ago

kermitt2 commented 4 years ago

As part of the embeddings and model management in DeLFT, download word embeddings, contextualized embeddings and transformer models on demand.

Thanks to @de-code, the functionality to automatically download embeddings is already available!
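
For readers unfamiliar with the idea, download on demand can be sketched roughly as below. This is only an illustration and not DeLFT's actual code: the function name `ensure_local` and the cache layout are hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path


def ensure_local(url: str, cache_dir: Path) -> Path:
    """Return a local path for `url`, downloading it only if it is not cached yet.

    Hypothetical helper for illustration; not part of DeLFT.
    """
    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / url.rsplit("/", 1)[-1]
    if target.exists():
        return target  # already downloaded, reuse the cached copy
    # Write to a temporary name first so an interrupted transfer
    # never leaves a half-written file under the final name.
    partial = target.with_name(target.name + ".part")
    with urllib.request.urlopen(url) as response, open(partial, "wb") as out:
        shutil.copyfileobj(response, out)
    partial.replace(target)
    return target
```

The first call fetches the file; later calls with the same URL and cache directory return the cached copy without touching the network.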

de-code commented 4 years ago

Just to add to that, I've also added preloading of embeddings as part of the Docker entrypoint via an environment variable (PRELOAD_EMBEDDING).
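
A minimal sketch of what such an environment-driven preload step might look like. Only the variable name PRELOAD_EMBEDDING comes from the comment above; the function, the `load` callback and the assumption that the variable holds a single embedding name are hypothetical and do not reflect the actual entrypoint.

```python
import os
from typing import Callable, Mapping, Optional


def preload_embedding(
    load: Callable[[str], None],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Call `load` with the value of PRELOAD_EMBEDDING if it is set.

    Assumption: the variable holds one embedding name. Returns that
    name, or None when the variable is unset or blank.
    """
    name = environ.get("PRELOAD_EMBEDDING", "").strip()
    if not name:
        return None  # nothing requested, start up without preloading
    load(name)
    return name
```

Passing the environment as a parameter keeps the step easy to test without modifying the real process environment.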

I did try it with xz (LZMA) compressed files but found that decompression took too long. I then reverted to using the uncompressed mdb file (the LMDB cache), even though it is more than twice the size. Which option is faster probably depends on the setup and the network connection.
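
To check this trade-off on a given setup, the decompression time can be measured with Python's standard `lzma` module. The sketch below is illustrative; the function name and the file paths in the usage note are hypothetical.

```python
import lzma
import shutil
import time
from pathlib import Path


def timed_decompress_xz(src: Path, dst: Path) -> float:
    """Decompress the .xz file `src` into `dst`; return elapsed seconds."""
    start = time.perf_counter()
    # Stream through copyfileobj so the whole file is never held in memory.
    with lzma.open(src, "rb") as compressed, open(dst, "wb") as out:
        shutil.copyfileobj(compressed, out)
    return time.perf_counter() - start
```

For example, `timed_decompress_xz(Path("data.mdb.xz"), Path("data.mdb"))` gives the decompression cost, which can then be compared with the extra time needed to download the larger uncompressed file.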