Atinoda / text-generation-webui-docker

Docker variants of oobabooga's text-generation-webui, including pre-built images.
GNU Affero General Public License v3.0

Update Ubuntu version to 24.04 #52

Open adamoutler opened 1 week ago

adamoutler commented 1 week ago

Ubuntu 22.04 LTS ships with Python 3.10 and has been superseded by 24.04 LTS, which ships a newer Python (3.11+). Ubuntu 22.04 with Python 3.10 is causing issues with LLM_Web_search.

LLM_Web_search | uber generated an exception: catch_warnings.__init__() got an unexpected keyword argument 'action'

[screenshot of the exception traceback]

According to the LLM_Web_search dev, LLM_Web_search requires Python 3.11; Python 3.10 will not work.

An update to Ubuntu 23+ is required for plugins that are used in text-generation-webui.

Atinoda commented 1 week ago

Hi @adamoutler - thank you for raising your issue. The error message does not look like a Python version issue to me, but I will make a note to check the compatibility of this plug-in.

adamoutler commented 1 week ago

The plugin dev identified it as a Python issue and recommended Ubuntu 23+.

Atinoda commented 1 week ago

Thanks - I have now read the conversation you had with mamel16. Upgrading the python version requires a lot of testing on my end - this application is a leaning tower of machine learning dependencies - but I will roll it into some other major updates that I have in mind.

Then again, sometimes it just works right away! You are welcome to modify the Docker file's base image if you'd like to give it a go.

adamoutler commented 1 week ago

I understand. I attempted this myself already; many of the pip packages have been moved into Ubuntu's package repositories. While I'm no stranger to dependency management, and I do prefer the stability of package managers over individual packaging, I'm positive there are differences that will hinder me further.

I may have some time to continue later. Even on a 20-thread processor, building this took a very long time, and I'm not sure what I can do to speed it up. Any recommendations for making a smaller build from your multi-stage Dockerfile?

Atinoda commented 6 days ago

Glad that you're trying it out! Please feel welcome to share your progress and experience.

Make sure that you specify the target when you're building and then it will skip the sections that it doesn't need. I'd imagine you want nvidia-default.
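A build command along these lines should skip the unneeded stages (the `nvidia-default` stage name is taken from the comment above - adjust it to whatever targets the Dockerfile actually defines):

```shell
# Build only the layers needed for the chosen stage
docker build --target nvidia-default -t text-generation-webui:custom .
```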

Docker builds cache their steps by default and will only re-build if something changes. Bear this in mind when you're tweaking things - try to put test variations and experiments after the longer build steps.
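As an illustrative sketch of that ordering (not the project's actual Dockerfile - stage and paths are assumed):

```dockerfile
FROM ubuntu:22.04 AS app_base
# Slow, rarely-changing steps first, so their cached layers survive experiments
RUN apt-get update && apt-get install --no-install-recommends -y build-essential git
COPY requirements.txt /app/requirements.txt
RUN pip3 install -r /app/requirements.txt   # re-runs only when requirements.txt changes
# Frequently-edited files and test variations last, where a cache miss is cheap
COPY . /app
```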

The default image takes about five minutes to build on my 5950x - I would expect your times to be similar to that.

adamoutler commented 3 days ago

So it looks like we're basically required to let Ubuntu manage the virtual environment in 24.04.

####################
### BUILD IMAGES ###
####################

# COMMON
FROM ubuntu:24.04 AS app_base
# Pre-reqs
RUN apt-get update && apt-get install --no-install-recommends -y \
    git vim build-essential python3-dev python3-venv python3-pip \
    python3-virtualenv
# Instantiate venv and pre-activate
# Credit, Itamar Turner-Trauring: https://pythonspeed.com/articles/activate-virtualenv-dockerfile/
ENV VIRTUAL_ENV=/venv
RUN virtualenv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

...

The only changes required were updating the base image and then installing the initial packages with the apt-get package manager instead of pip. I'm 500s into the build on step 16/29. Got other stuff going on - will check in later with results.

adamoutler commented 2 days ago

Looks like it's expecting 3.10 but getting 3.10.9?

 => ERROR [app_nvidia 2/2] RUN pip3 install -r /app/requirements.txt                                                                                25.0s
------
 > [app_nvidia 2/2] RUN pip3 install -r /app/requirements.txt:
6.047 Ignoring llama-cpp-python: markers 'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"' don't match your environment
6.047 Ignoring llama-cpp-python: markers 'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"' don't match your environment
6.048 Ignoring llama-cpp-python: markers 'platform_system == "Windows" and python_version == "3.11"' don't match your environment
6.048 Ignoring llama-cpp-python: markers 'platform_system == "Windows" and python_version == "3.10"' don't match your environment
6.049 Ignoring llama-cpp-python-cuda: markers 'platform_system == "Windows" and python_version == "3.11"' don't match your environment
6.049 Ignoring llama-cpp-python-cuda: markers 'platform_system == "Windows" and python_version == "3.10"' don't match your environment
6.049 Ignoring llama-cpp-python-cuda: markers 'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"' don't match your environment
6.050 Ignoring llama-cpp-python-cuda: markers 'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"' don't match your environment
6.050 Ignoring llama-cpp-python-cuda-tensorcores: markers 'platform_system == "Windows" and python_version == "3.11"' don't match your environment
6.050 Ignoring llama-cpp-python-cuda-tensorcores: markers 'platform_system == "Windows" and python_version == "3.10"' don't match your environment
6.051 Ignoring llama-cpp-python-cuda-tensorcores: markers 'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"' don't match your environment
6.051 Ignoring llama-cpp-python-cuda-tensorcores: markers 'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"' don't match your environment
6.051 Ignoring exllamav2: markers 'platform_system == "Windows" and python_version == "3.11"' don't match your environment
6.052 Ignoring exllamav2: markers 'platform_system == "Windows" and python_version == "3.10"' don't match your environment
6.052 Ignoring exllamav2: markers 'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"' don't match your environment
6.053 Ignoring exllamav2: markers 'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"' don't match your environment
6.053 Ignoring exllamav2: markers 'platform_system == "Linux" and platform_machine != "x86_64"' don't match your environment
6.053 Ignoring flash-attn: markers 'platform_system == "Windows" and python_version == "3.11"' don't match your environment
6.054 Ignoring flash-attn: markers 'platform_system == "Windows" and python_version == "3.10"' don't match your environment
6.054 Ignoring flash-attn: markers 'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.11"' don't match your environment
6.054 Ignoring flash-attn: markers 'platform_system == "Linux" and platform_machine == "x86_64" and python_version == "3.10"' don't match your environment
6.804 Collecting accelerate==0.30.* (from -r /app/requirements.txt (line 1))
6.996   Downloading accelerate-0.30.1-py3-none-any.whl.metadata (18 kB)
8.758 Collecting aqlm==1.1.5 (from aqlm[cpu,gpu]==1.1.5->-r /app/requirements.txt (line 2))
8.801   Downloading aqlm-1.1.5-py3-none-any.whl.metadata (1.7 kB)
10.30 Collecting auto-gptq==0.7.1 (from -r /app/requirements.txt (line 3))
10.34   Downloading auto_gptq-0.7.1.tar.gz (126 kB)
10.45      ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 126.1/126.1 kB 1.1 MB/s eta 0:00:00
10.57   Installing build dependencies: started
21.08   Installing build dependencies: finished with status 'done'
21.08   Getting requirements to build wheel: started
21.56   Getting requirements to build wheel: finished with status 'error'
21.57   error: subprocess-exited-with-error
21.57
21.57   × Getting requirements to build wheel did not run successfully.
21.57   │ exit code: 1
21.57   ╰─> [1 lines of output]
21.57       Building cuda extension requires PyTorch (>=1.13.0) being installed, please install PyTorch first: No module named 'torch'
21.57       [end of output]
21.57
21.57   note: This error originates from a subprocess, and is likely not a problem with pip.
21.57 error: subprocess-exited-with-error
21.57
21.57 × Getting requirements to build wheel did not run successfully.
21.57 │ exit code: 1
21.57 ╰─> See above for output.
21.57
21.57 note: This error originates from a subprocess, and is likely not a problem with pip.
------
Dockerfile:43
--------------------
  41 |         --index-url https://download.pytorch.org/whl/cu121
  42 |     # Install oobabooga/text-generation-webui
  43 | >>> RUN pip3 install -r /app/requirements.txt
  44 |
  45 |     # Extended
--------------------
ERROR: failed to solve: process "/bin/sh -c pip3 install -r /app/requirements.txt" did not complete successfully: exit code: 1
(base) root@shark-wrangler:~/text-generation-webui-docker# python --version
Python 3.10.9
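(For reference, those "Ignoring ..." lines are pip evaluating PEP 508 environment markers: each requirement is skipped when its marker doesn't match the running interpreter, so a container Python that is neither 3.10 nor 3.11 matches none of them. A minimal illustration using the `packaging` library, with a simplified marker string:)

```python
from packaging.markers import Marker

# A simplified version of the marker strings pip prints above
marker = Marker('platform_system == "Linux" and python_version == "3.11"')

# Evaluate against explicit environments instead of the running interpreter
print(marker.evaluate({"platform_system": "Linux", "python_version": "3.11"}))  # True  -> installed
print(marker.evaluate({"platform_system": "Linux", "python_version": "3.10"}))  # False -> "Ignoring ..."
```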

Atinoda commented 2 days ago

Hi @adamoutler, thank you for trying the update and sharing your results - it's useful to know your experience with it. Seems like we did not get lucky with just changing the base image - it was worth a shot, though!

I'm not sure why the venv would need to be managed by Ubuntu... I would prefer to keep it via pip. I'll have a look at installing a different Python version in a venv, and also consider using a conda-forge environment instead. The first option is a smaller change; the second is a larger rework.

adamoutler commented 2 days ago

The reason for the apt install is that, when working directly on the OS, you must now either use apt or run pip ... --break-system-packages. It's cleaner to use the OS package to install venv, then hop into the venv and install whatever is required.
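Concretely, on 24.04 a bare pip install outside a venv is refused as an "externally-managed-environment" (PEP 668). A sketch of the two options, with `requests` as a stand-in package name:

```shell
# Refused on Ubuntu 24.04: the system Python is externally managed
pip3 install requests                          # error: externally-managed-environment

# Option 1 (discouraged): override the protection
pip3 install requests --break-system-packages

# Option 2 (cleaner): get venv tooling from the OS, then install inside a venv
apt-get install --no-install-recommends -y python3-venv
python3 -m venv /venv
/venv/bin/pip install requests
```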

Atinoda commented 2 days ago

I'm not quite sure I follow - isn't that the same as the current approach in the Dockerfile?

adamoutler commented 2 days ago

No. pip is used to obtain the virtual environment before using it.

https://github.com/Atinoda/text-generation-webui-docker/blob/master/Dockerfile#L11
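A sketch of the distinction (the pip line is paraphrased from the linked Dockerfile; the apt line is the proposed 24.04-friendly replacement):

```dockerfile
# Current approach: virtualenv itself is obtained via pip
RUN pip3 install virtualenv

# Proposed: virtualenv comes from the OS package manager instead
RUN apt-get update && apt-get install --no-install-recommends -y python3-virtualenv

# Either way, the venv is then created and used identically
RUN virtualenv /venv
```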

Atinoda commented 1 day ago

Thanks for clarifying - I can see the distinction now. Just using apt-get to install the virtualenv package, instead of pip. I'm planning to look at this one over the summer break, so it's helpful to have a head-start.

Seems like there will be a need to manage simultaneous 3.10 and 3.11 environments starting from v1.8 of textgen - it'll be interesting navigating that!

adamoutler commented 1 day ago

Is there any requirement to use 3.10? Seems like 3.11 is the way.

Actually, I was also wondering: is there any requirement to use Ubuntu? It might be cleaner to start with an Alpine container and build in exactly what is required. I'm considering doing this because:

  1. The Alpine packages are significantly faster to install.
  2. The overhead and image sizes are smaller.
  3. The attack surface is decreased because no default services exist.
  4. Once you have python/pip, I'm not really seeing any dependencies on the system.

The one thing I'm really concerned about is how to install the Nvidia drivers on Alpine.

Atinoda commented 15 hours ago

The release notes mention that 3.10 is required for TensorRT-LLM - apparently, it's the new and fastest backend in the project. I haven't had time to test it yet though.

Regarding Ubuntu as a base: I used to have an Nvidia machine-learning image as the base - which produced even larger images - so Ubuntu was the slimming-down effort at the time, plus the benefits of generalisation. I'm a big fan of Alpine-based images for embedded systems and micro-services, and use them regularly there - and you raise valid points about their strengths.

Given that text-generation-webui is for running LLMs - which are absolute resource monsters - I don't feel that the image needs to be super minimal or lightweight. With the extra packages and built-in capabilities of Ubuntu, it also helps with things like installing the Nvidia drivers, as you mentioned - which can be a fiasco. I also feel it's a better basis to use for development (which this image supports with the local source build option), and is more accessible to Linux newcomers.

The build times can indeed be slow and painful for this image and its variants - and that was one of my main motivations for setting up the project and pushing the pre-built versions to Docker Hub. Textgen is a fast-moving, cutting-edge project rather than mature, stable software - slimming its deployment down and tuning dependencies that might change tomorrow is making a rod for your own back. Rather, the goal is to ease its deployment, increase accessibility, and get people up and running quickly - while still offering the option to get hands-on and build or tweak it yourself.

None of these things are a requirement as such - very few things in life are - but that's my current thinking and motivation. Spend more of my finite resources on developing features and less on minimising the image footprint.