Streamlined `text-generation-webui` container 🍒

ms1design commented 6 months ago

Hi @dusty-nv 👋

This is a 🍒 cherry-picked PR from https://github.com/dusty-nv/jetson-containers/pull/414 that brings over only the improvements to the text-generation-webui container:

- Reduced the number of layers
- Fixed formatting
- Refactored the torch-grammar installation commands
- Added sentence-transformers and flash-attention with a git patch (v1 & v2 supported)
- Created symbolic links instead of duplicating files from the GPTQ-for-LLaMa container. @dusty-nv do we still need GPTQ-for-LLaMa as a dependency when we use auto_gptq in the text-generation-webui container?
- Introduced automatic installation of text-generation-webui extensions using the native one_click.install_extensions_requirements script
- Migrated from settings.json to settings.yaml: when a user saves the text-generation-webui UI settings to disk, they are written to settings.yaml, not settings.json
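
To illustrate the symbolic-link item above, here is a minimal sketch of linking files into a target directory instead of copying them. It is not the code from this PR: the helper name `link_instead_of_copy` and the directories it is given are hypothetical, and the real container paths are not shown here.

```python
import tempfile
from pathlib import Path


def link_instead_of_copy(src_dir: Path, dst_dir: Path, pattern: str = "*") -> list[Path]:
    """Symlink every regular file in src_dir matching pattern into dst_dir.

    Hypothetical helper for illustration only; it is not part of
    jetson-containers or text-generation-webui.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    links = []
    for src in sorted(src_dir.glob(pattern)):
        if not src.is_file():
            continue  # skip sub-directories and anything that is not a regular file
        link = dst_dir / src.name
        if link.is_symlink() or link.is_file():
            link.unlink()  # replace a stale link or an earlier duplicated copy
        link.symlink_to(src.resolve())  # absolute target, so the link works from any cwd
        links.append(link)
    return links


if __name__ == "__main__":
    # Demo in a throwaway directory; these paths are placeholders, not container paths.
    with tempfile.TemporaryDirectory() as tmp:
        src = Path(tmp) / "source_repo"
        dst = Path(tmp) / "webui_repo"
        src.mkdir()
        (src / "example_module.py").write_text("VALUE = 1\n")
        for link in link_instead_of_copy(src, dst, "*.py"):
            print(link.name, "->", link.resolve())
```

The upside is that there is a single copy of each file on disk, so an update to the source is visible through the link; the downside is that the links dangle if the source directory is later removed from the image.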

dusty-nv commented 6 months ago

Thanks @ms1design - just merged this, appreciate the cleanup! I noticed you added bitsandbytes back in, which I had disabled because it no longer builds for JP6 (and was slow anyways) - did that still build for you?

And GPTQ-for-Llama isn't really needed anymore, no. I had just kept it around for legacy purposes because it was already there and still compiled. I should just remove some of the unused/unmaintained packages from the repo, like text-generation-inference too. Those are from when I was first exploring which inference APIs were fastest, and now they are just a support burden (at that time, the LLM/quantization APIs were also evolving rapidly).

ms1design commented 6 months ago

Hey @dusty-nv !

> I noticed you added bitsandbytes back in, which I had disabled because it no longer builds for JP6 (and was slow anyways) - did that still build for you?

Yes, bitsandbytes builds again ;) You can find the working Dockerfile in my PR here: https://github.com/dusty-nv/jetson-containers/pull/420

> And GPTQ-for-Llama isn't really needed anymore no...

We definitely need a cleanup :)

dusty-nv / jetson-containers

Streamlined `text-generation-webui` container 🍒 #439