dusty-nv / jetson-containers

Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
MIT License

About minimizing local_llm image size #476

Open hardychen1991 opened 2 months ago

hardychen1991 commented 2 months ago

Hi, thanks for such great work! Just wondering if anyone has looked into minimizing the local_llm image size?

I've tried to build a customized image for text-only SLM inference, specifically with Gemma-2B, but the number of base images and packages is a bit overwhelming. Any information or advice would be appreciated. Thanks!

dusty-nv commented 2 months ago

Hi @hardychen1991, yea I feel you. I've been trying to make this smaller and build faster, and in fact we basically re-did most of the containers in this repo for minimization. Perhaps unsurprisingly, considering what it achieves, this one has many big/complex dependencies including MLC/TVM, AWQ, FAISS, ASR/TTS, etc., so it is still quite large. local_llm has also transitioned to NanoLLM for future development, where I hope to continue making progress on issues like this:
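For the text-only use case, something along these lines should work with NanoLLM, which only needs the LLM backend (no ASR/TTS or vector-store pieces at runtime). This is a minimal sketch based on the documented NanoLLM usage; the "google/gemma-2b" model name, its support under the MLC backend, and the quantization setting are assumptions here, not something verified in this thread:

```python
from nano_llm import NanoLLM

# Load the model through the MLC backend.
# NOTE: "google/gemma-2b" and q4f16_ft are illustrative; swap in whatever
# model/quantization combination actually works for your setup.
model = NanoLLM.from_pretrained(
    "google/gemma-2b",          # HuggingFace repo name or local checkpoint path
    api='mlc',                  # supported APIs include mlc, awq, hf
    quantization='q4f16_ft',    # 4-bit weight quantization for MLC
)

# Stream a text-only completion, token by token.
response = model.generate("Once upon a time,", max_new_tokens=128)

for token in response:
    print(token, end='', flush=True)
```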