@sammcj the dependencies include torch to run the cot_decoding and entropy_decoding approaches, which are implemented in PyTorch:
https://github.com/codelion/optillm/blob/94fad7846e82cd24f4603a4da7019ba242f40be3/requirements.txt#L7C1-L7C6
If you are not going to use them, you can try commenting out the torch and transformers dependencies in requirements.txt and see if that helps.
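For illustration, something like this at image build time would also work instead of editing the file by hand (requirements-slim.txt is just a placeholder name, and it assumes each dependency sits on its own line):

```sh
# Sketch: install everything except the heavy PyTorch stack, assuming the
# cot_decoding / entropy_decoding approaches are not needed in this deployment.
grep -vE '^(torch|transformers)' requirements.txt > requirements-slim.txt
pip install --no-cache-dir -r requirements-slim.txt
```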
Thanks, that indeed reduced the image size from 6.36GB to 950MB!
FYI / for context: the reason I'm trying to cut this down is that I'm looking at embedding Optillm within a Lambda I've written that provides an OpenAI-compatible API in front of LLMs running on Amazon Bedrock (or, technically, any LLMs running on AWS).
If I get the Optillm integration working nicely I'll be sure to give a shout out to the project and share the link here :)
Thanks for trying out optillm @sammcj ! Let me know if you need any more help with your setup.
Looking at the resulting built image, the size is being blown out by cudnn (no surprises there):
[screenshot: image layer breakdown showing the cudnn libraries]
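(For reference, the per-layer sizes can be checked with plain docker history; the image tag below is a placeholder:)

```sh
# Show each layer of the built image along with its size
docker history --no-trunc optillm:latest
```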
I'm just wondering whether cudnn needs to be baked into the image, or whether the application only needs some specific libraries, which could reduce the size significantly?
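One option that might avoid it, if CPU-only inference is acceptable for those approaches (an assumption on my part), would be installing PyTorch from the CPU-only wheel index, which doesn't bundle the CUDA/cuDNN libraries:

```sh
# Sketch: CPU-only PyTorch wheels, which skip the bundled CUDA/cuDNN libraries
pip install --no-cache-dir torch --index-url https://download.pytorch.org/whl/cpu
```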