Closed by kermitt2 2 years ago
Just to add to that, I've also added preloading of embeddings as part of the Docker entrypoint via an environment variable (PRELOAD_EMBEDDING).
I did try it with xz (LZMA) compressed files but found it took too long to decompress. I then reverted to using the decompressed mdb (LMDB cache), even though it was more than twice the size. But that probably depends on the setup and network connection.
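For readers who want a picture of the preloading step, here is a minimal sketch of what an entrypoint hook driven by PRELOAD_EMBEDDING could look like. Only the variable name comes from the comment above; the function name preload_from_env, the load_embedding callback, and the assumption that the variable holds a single embedding name are hypothetical and not taken from DeLFT.

```python
import os
from typing import Callable, Mapping, Optional


def preload_from_env(
    load_embedding: Callable[[str], None],
    environ: Mapping[str, str] = os.environ,
) -> Optional[str]:
    """Preload the embedding named in PRELOAD_EMBEDDING, if the variable is set.

    `load_embedding` is a hypothetical callback standing in for whatever
    actually fetches and opens the embedding; it is not a DeLFT function.
    Returns the embedding name that was preloaded, or None if nothing was done.
    """
    # Assumption: the variable holds one embedding name; empty means "skip".
    name = environ.get("PRELOAD_EMBEDDING", "").strip()
    if not name:
        return None
    load_embedding(name)
    return name
```

An entrypoint script could call this once before starting the service, so the first request does not pay the download and load cost.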
As part of the embeddings and model management in DeLFT, download word embeddings, contextualized embeddings and transformer models on demand.
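A rough sketch of the download-on-demand idea, assuming a simple name-to-URL registry and a local cache directory. The names ensure_resource, registry and cache_dir are illustrative only and do not reflect how DeLFT implements this; the fetch parameter defaults to the standard library's urllib.request.urlretrieve and can be swapped out.

```python
import os
import posixpath
import urllib.request
from typing import Callable, Mapping
from urllib.parse import urlparse


def ensure_resource(
    name: str,
    registry: Mapping[str, str],
    cache_dir: str,
    fetch: Callable[[str, str], object] = urllib.request.urlretrieve,
) -> str:
    """Return a local path for `name`, downloading it only if it is missing.

    `registry` maps resource names to download URLs (hypothetical layout).
    """
    if name not in registry:
        raise KeyError(f"unknown resource: {name}")
    url = registry[name]

    os.makedirs(cache_dir, exist_ok=True)
    filename = posixpath.basename(urlparse(url).path)
    local_path = os.path.join(cache_dir, filename)

    if not os.path.exists(local_path):
        # Download to a temporary name first so that an interrupted
        # transfer is never mistaken for a complete file.
        partial_path = local_path + ".part"
        fetch(url, partial_path)
        os.replace(partial_path, local_path)
    return local_path
```

A second call with the same name returns the cached path without touching the network, which is the behaviour one would want for large embedding files.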
Thanks to @de-code, the functionality to automatically download embeddings is already available!