michaelfeil / infinity

Infinity is a high-throughput, low-latency REST API for serving vector embeddings, supporting a wide range of text-embedding models and frameworks.
https://michaelfeil.eu/infinity/
MIT License

shrink: docker image size by pruning venv #139

Closed peebles closed 3 months ago

peebles commented 3 months ago

The Docker image michaelf34/infinity:latest is about 6.5 GB uncompressed. Exploring this, I noticed the following inside the container:

# du -sh /root/rerank-test/.venv/ /app/.venv/
5.4G    /root/rerank-test/.venv/
5.6G    /app/.venv/

This adds up to more than 6.5 GB, though, so I am not sure what is going on. Something might have leaked past .dockerignore when this image was built.

REPOSITORY                               TAG       DIGEST                                                                    IMAGE ID       CREATED        SIZE
michaelf34/infinity                      latest    sha256:42f31eeb195eec83960f8b505887aa8f4da64c7cdeddafd6f2fd6a7cbd008162   a8e432629682   2 days ago     6.5GB

I may not be using the very latest image.
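One way to cross-check numbers like these is to total the `du` figures and compare them with the size Docker reports. The sketch below is only an illustration: `total_kib` is a hypothetical helper name, not something shipped with the image. Keep in mind that `du` inside a running container also counts bind mounts and files created after the container started, which the SIZE column of `docker images` does not include.

```shell
# Hypothetical helper: print the combined size, in KiB, of the given directories.
total_kib() {
    # du -sk prints one "size<TAB>path" line per argument; awk adds up the sizes.
    du -sk "$@" | awk '{ sum += $1 } END { print sum + 0 }'
}

# Example usage inside the container:
#   total_kib /app/.venv
# On the host, per-layer sizes of the image can be listed with:
#   docker history michaelf34/infinity:latest
```

If the `du` total is larger than the image size, the extra data probably lives outside the image layers.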

michaelfeil commented 3 months ago

@peebles Impressive, right? Just 6 GB, and around 3 GB compressed. Sadly, there is not much to improve: it's mostly the drivers and the Python venv with the required packages. It could be smaller for some targets, e.g. a CPU / ONNX container could go below 1 GB.

To be fair, pytorch + cudnn packages add up fast - I used multi-stage docker builds to compress what I could.

For reference: here are some similar projects.

https://hub.docker.com/r/anibali/pytorch/tags
https://hub.docker.com/layers/winglian/axolotl-cloud/main-py3.10-cu118-2.1.2/images/sha256-9d5a353eb30494e9835a70cc9760de48ea2c95a5c9232e14beda214900440219?context=explore

peebles commented 3 months ago

Yes, but what is: /root/rerank-test/.venv/ ?

peebles commented 3 months ago

When building Python-based Lambda functions on AWS, I found that stripping the libraries was a big saving. Something like:

find .venv -name "*.so" -not -path "*/scipy/special/*" -not -path "*/scipy/sparse/linalg/*" -not -path "*/scipy/linalg/*" -not -path "*/scipy/optimize/*" -not -path "*/scipy/integrate/*" -exec strip {} \;

The scipy exclusions are in there because stripping those scipy libraries caused runtime errors. I am not saying you should do this, but it is something to consider. On AWS I had no choice.
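Before stripping anything, it may help to do a dry run that only lists the files the command would touch. The sketch below assumes the same scipy exclusions as above; `list_strippable` is a hypothetical helper name, and `/app/.venv` is the venv path seen in the container earlier in this thread.

```shell
# Hypothetical helper: list the shared objects that would be stripped,
# skipping the scipy subpackages that broke at runtime when stripped.
list_strippable() {
    venv_dir="$1"
    find "$venv_dir" -type f -name "*.so" \
        -not -path "*/scipy/special/*" \
        -not -path "*/scipy/sparse/linalg/*" \
        -not -path "*/scipy/linalg/*" \
        -not -path "*/scipy/optimize/*" \
        -not -path "*/scipy/integrate/*" \
        -print
}

# Dry run: count the candidates first.
#   list_strippable /app/.venv | wc -l
# Then strip them one by one (assumes paths without newlines):
#   list_strippable /app/.venv | while IFS= read -r f; do strip "$f"; done
```

Comparing `du -sh /app/.venv` before and after would show how much space was actually saved. Whether the result still runs needs to be checked against the real workload.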

peebles commented 3 months ago

Ah! Forget the /root/rerank-test ... that's my test! So sorry!! I'm losing it, man!

michaelfeil commented 3 months ago

@peebles No worries, still interested in pruning some libraries - but slightly worried that most *.so files might be needed by pytorch and that doing so will hurt development velocity.

michaelfeil commented 3 months ago

Closing due to inactivity. Feel free to reopen if you find something!