dusty-nv / jetson-containers

Machine Learning Containers for NVIDIA Jetson and JetPack-L4T
MIT License

Streamlined `text-generation-webui` container 🍒 #439

ms1design closed 6 months ago

ms1design commented 6 months ago

Hi @dusty-nv 👋

This is a 🍒 cherry-picked PR from https://github.com/dusty-nv/jetson-containers/pull/414, limited to the improvements for the `text-generation-webui` container.

dusty-nv commented 6 months ago

Thanks @ms1design - just merged this, appreciate the cleanup! I noticed you added bitsandbytes back in, which I had disabled because it no longer builds for JP6 (and was slow anyways) - did that still build for you?

And GPTQ-for-Llama isn't really needed anymore, no. I had just kept it around for legacy purposes because it was already there and still compiled. I should just remove some of the unused/unmaintained packages from the repo, like text-generation-inference too. Those date from when I was first exploring which inference APIs were fastest, and they are now just a support burden (at that time, the LLM/quantization APIs were also evolving rapidly).

ms1design commented 6 months ago

Hey @dusty-nv !

> I noticed you added bitsandbytes back in, which I had disabled because it no longer builds for JP6 (and was slow anyways) - did that still build for you?

Yes, bitsandbytes builds again ;) You can find the working Dockerfile in my PR here: https://github.com/dusty-nv/jetson-containers/pull/420

> And GPTQ-for-Llama isn't really needed anymore no...

We definitely need a cleanup :)