Closed cbrousseauAumni closed 2 weeks ago
No GPU, CPU only
Further addition: when trying to install an update to ollama with the install script, the linux amd64 version is downloaded instead of a linux x86_64 version.
It's not clear from your summary what the actual contents of your entrypoint are; if it's literally what you have there, I would expect several errors.
If you are running standalone docker, try:
$ docker run --rm -d --entrypoint bash --name ollama-pull ollama/ollama -c '(sleep 2 ; ollama pull qwen:0.5b) & exec ollama serve'
f16f5557aeff86f0089ba256490d0cb60656638fe5e6560e1424e3d87c5f94e4
$ docker exec -it ollama-pull ollama list
NAME ID SIZE MODIFIED
qwen:0.5b b5dc5e784f2a 394 MB 1 second ago
$ docker exec -it ollama-pull ollama run qwen:0.5b hello
Hi! How can I help you today?
If you are using docker compose:
services:
ollama:
image: ollama/ollama
entrypoint: bash -c '(sleep 2 ; ollama pull ${MODEL-mistral}) & exec ollama serve'
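Compose substitutes ${MODEL-mistral} from the shell environment or a .env file, falling back to mistral if MODEL is unset. A hedged usage example, assuming the compose file above is in the current directory:

$ MODEL=qwen:0.5b docker compose up -d
$ docker compose exec ollama ollama list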
amd64 is the same as x86_64.
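For example, on a typical x86_64 Linux host:

$ uname -m
x86_64

The kernel reports the architecture as x86_64, while the ollama release assets (and Debian-style packaging generally) call the same instruction set amd64, so the install script downloading an "amd64" build is expected.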
I'm actually not running the ollama docker image anywhere; it's a chainguard image.
# install and run ollama
RUN curl -fsSL https://ollama.com/install.sh | sh
ENV GIN_MODE="release"
ENV OLLAMA_HOST=http://0.0.0.0:11434
RUN ollama serve & sleep 5 && ollama pull mistral
This runs fine
#!/bin/sh
ollama serve & sleep 5
ollama run mistral ""& /set nohistory & /set quiet &
This seems to run fine, but then after the entrypoint executes there are the issues mentioned above.
When you build the image, ollama is running as root and the models are stored in /root/.ollama. I'm not familiar with the format of the output of ps in your summary, but if I'm interpreting '{ollama}' correctly, you are running the server as user ollama. That doesn't really match the contents of your entrypoint file - some of the other stuff from the Dockerfile might shed light on how you are starting the server. In any case, if it's running as ollama, the models will be stored in /usr/share/ollama. You can override this with OLLAMA_MODELS.
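A minimal sketch of pinning the model directory in the Dockerfile so that the build-time pull and the runtime server agree (the /models path and the mkdir are illustrative, not taken from your Dockerfile):

# store models in a fixed location, independent of which user runs the server
ENV OLLAMA_MODELS=/models
RUN mkdir -p /models
# pull at build time; the server started at runtime will look in the same place
RUN ollama serve & sleep 5 && ollama pull mistral

If the server ends up running as a non-root user, the directory ownership and permissions need to allow that user to read it.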
Note that your entrypoint file has problems. /set nohistory and /set quiet are not consumed by the ollama run command; the shell will try to execute the command /set with arguments nohistory and quiet. If the entrypoint file is the sole command executed when the container starts (ie, not part of some run system) then the container will terminate when the ollama run command completes.
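To illustrate the /set problem, this is roughly what the shell ends up doing (exact error text varies by shell):

$ /set nohistory
sh: /set: not found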
Great, I think this conversation may have helped me see how to solve several of the problems. So I can set OLLAMA_MODELS to the .ollama/models directory where mistral is being pulled during build. Is there a way of making sure nohistory and quiet are in place, either through env vars or someplace else? This is a read-only file system, otherwise it wouldn't be important.
I'm not sure why you want to set nohistory and quiet, since stdin of ollama run won't read from a terminal because you gave it an argument. You can use OLLAMA_NOHISTORY=1 to disable interactive history, and quiet is the default so you don't need to set it.
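If you do want history disabled, a hedged sketch is to set it as an environment variable in the Dockerfile or the entrypoint:

ENV OLLAMA_NOHISTORY=1

or, in entry_script.sh:

export OLLAMA_NOHISTORY=1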
Ok, I've modified the entry_script.sh to be this:
#!/bin/sh
ollama serve & sleep 5
ollama run mistral "" &
Are there any other changes that need to happen here?
As for the docker image, it's a chainguard image; these are the Ollama env vars, and nothing else related to ollama is in there:
RUN mkdir '/tmp/ollama'
ENV OLLAMA_HISTORY="/tmp/ollama"
ENV OLLAMA_TMPDIR="/tmp/ollama"
ENV OLLAMA_ORIGINS=*
ENV OLLAMA_MODELS=${FUNCTION_DIR}/.ollama/models/
I really just need to make sure that ollama run mistral is running when we get to the POST requests, so I don't get connection refused errors.
ollama run mistral is not really required; ollama will load the model when the first request is received, although you will save a couple of seconds of response time for that first query. You also probably want to set OLLAMA_KEEP_ALIVE=-1 to stop the model from being unloaded when it's idle, and OLLAMA_NUM_PARALLEL=1 to save some VRAM if the model is not expected to run multiple concurrent completions.
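A hedged sketch of what that would look like in the Dockerfile:

# keep the model loaded instead of unloading it after the default idle timeout
ENV OLLAMA_KEEP_ALIVE=-1
# one completion at a time, so the server doesn't reserve memory for parallel requests
ENV OLLAMA_NUM_PARALLEL=1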
Again, if the entrypoint is the sole command executed at container start, the container will terminate when the command finishes execution.
#!/bin/sh
(sleep 2 ; ollama run mistral "") &
exec ollama serve
Beautiful, thank you for helping me understand it better. I'll test this now.
Thank you @rick-github, nice detailed explanations.
Unfortunately, when Ollama gets its first request, I get this error, this is the same one I've been getting: litellm.exceptions.APIConnectionError: litellm.APIConnectionError: OllamaException - {"error":"model \"mistral\" not found, try pulling it first"}
If you provide the full Dockerfile and dependencies, it will aid in debugging.
When the server runs, it outputs this:
Couldn't find '~/.ollama/id_ed25519'. Generating new private key.
But this is the output of ls -a in ~/.ollama:
ls -a ~/.ollama/
. .. id_ed25519 id_ed25519.pub models
Are we sure that overriding with OLLAMA_MODELS is working, or is there another environment variable I'm missing?
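One way to check, assuming the container is named ollama-test (substitute your container name):

$ docker exec ollama-test sh -c 'env | grep OLLAMA'
$ docker exec ollama-test sh -c 'ls -R "$OLLAMA_MODELS"'
$ docker exec ollama-test ollama list

If ollama list is empty but the directory contains blobs and manifests, the running server is reading a different OLLAMA_MODELS than the one the build-time pull wrote to.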
Here's the dockerfile:
ARG FUNCTION_DIR='/function'
# Used to include pip packages only since the prod image does not include pip
FROM chainguard-image AS build-image
ARG FUNCTION_DIR
USER root
RUN mkdir -p ${FUNCTION_DIR}
# make a virtual environment in function dir
RUN python -m venv ${FUNCTION_DIR}/venv
COPY requirements.txt ${FUNCTION_DIR}/requirements.txt
RUN ${FUNCTION_DIR}/venv/bin/pip install --no-cache-dir awslambdaric
# Install dev dependencies
RUN apk update && apk add --no-cache git wget cmake zip
# install python deps
RUN ${FUNCTION_DIR}/venv/bin/pip install --no-cache-dir wheel
RUN ${FUNCTION_DIR}/venv/bin/pip install --no-cache-dir -r ${FUNCTION_DIR}/requirements.txt
FROM company-image
USER root
WORKDIR ${FUNCTION_DIR}
# Install any needed dependencies into the final image
RUN apk update && apk add --no-cache git git-lfs wget libgcc
RUN git lfs install
# the code
COPY config ${FUNCTION_DIR}/config
COPY app ${FUNCTION_DIR}/app
COPY tests ${FUNCTION_DIR}/tests
COPY --chmod=755 entry_script.sh ${FUNCTION_DIR}
# copy over ollama deps
RUN mkdir '/tmp/ollama'
ENV OLLAMA_HISTORY="/tmp/ollama"
ENV OLLAMA_TMPDIR="/tmp/ollama"
ENV OLLAMA_ORIGINS=*
ENV OLLAMA_MODELS=${FUNCTION_DIR}/.ollama/models
COPY ./ollama_install.sh ${FUNCTION_DIR}
RUN chmod +x ${FUNCTION_DIR}/ollama_install.sh
RUN ${FUNCTION_DIR}/ollama_install.sh
ENV GIN_MODE="release"
ENV OLLAMA_HOST=http://0.0.0.0:11434
RUN ollama serve & sleep 5 && ollama pull mistral
#COPY models /usr/share/ollama/.ollama/models
# copy over Chroma deps
RUN git clone https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2 ${FUNCTION_DIR}/MiniLM
RUN rm -rf ${FUNCTION_DIR}/MiniLM/.git
# Copy in the built dependencies
COPY --from=build-image ${FUNCTION_DIR} ${FUNCTION_DIR}
# set env vars
ENV HOME=${FUNCTION_DIR}
#ENV OLLAMA_KEEP_ALIVE=-1
ENV ANONYMIZED_TELEMETRY=False
# set up a runtime interface client as default command for the container runtime
ENTRYPOINT ["./entry_script.sh"]
# Set the CMD to your handler (could also be done as a parameter override outside of the Dockerfile)
CMD [ "app.function.handler" ]
Contents of ollama_install.sh, entry_script.sh? What's the base for company-image?
ollama_install.sh is the vanilla install/upgrade script from here: https://ollama.com/install.sh. entry_script.sh is up above, and the company-image base is wolfi.
One more piece of info: when I exec into the docker container and check whether ollama run mistral works, I get this:
I don't know whether the model is actually completely downloading during the docker build process when ollama pull mistral is being run.
I substituted company-image with cgr.dev/chainguard/wolfi-base and commented out the stuff related to building the app. entry_script.sh was changed to:
#!/bin/sh
(sleep 2 ; ollama run mistral "") &
ollama serve &
exec $@
$ docker build -f Dockerfile -t 7301 .
# since I don't have an app, I pass a `sleep` command to the container so that it won't error out
$ docker run --rm -d -p 11430:11434 --name 7301 7301 sleep 300
$ docker exec 7301 ollama list
NAME ID SIZE MODIFIED
mistral:latest f974a74358d6 4.1 GB 26 minutes ago
$ curl -s localhost:11430/api/tags | jq
{
"models": [
{
"name": "mistral:latest",
"model": "mistral:latest",
"modified_at": "2024-10-22T20:36:53.105958546Z",
"size": 4113301824,
"digest": "f974a74358d62a017b37c6f424fcdf2744ca02926c4f952513ddf474b2fa5091",
"details": {
"parent_model": "",
"format": "gguf",
"family": "llama",
"families": [
"llama"
],
"parameter_size": "7.2B",
"quantization_level": "Q4_0"
}
}
]
}
$ curl -s localhost:11430/api/generate -d '{"model":"mistral","prompt":"hello","stream":false}' | jq -r .response
Hello! How can I help you today? 😊
If you have any questions or need assistance with something, feel free to ask! I'm here to help. If you just want to chat about a specific topic or share your thoughts, that's cool too. Let me know what you need!
Note that you set FUNCTION_DIR at the top of the Dockerfile but not after the FROM statements, which reset variables. So the models are being stored in /.ollama/models.
$ docker exec 7301 env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=887108e16a9c
SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
OLLAMA_HISTORY=/tmp/ollama
OLLAMA_TMPDIR=/tmp/ollama
OLLAMA_ORIGINS=*
OLLAMA_MODELS=/.ollama/models
GIN_MODE=release
OLLAMA_HOST=http://0.0.0.0:11434
HOME=/root
ANONYMIZED_TELEMETRY=False
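A hedged sketch of carrying the ARG across stages is to re-declare it after each FROM that needs it:

ARG FUNCTION_DIR='/function'

FROM company-image
# re-declare so the value defined above the FROM is visible in this stage
ARG FUNCTION_DIR
ENV OLLAMA_MODELS=${FUNCTION_DIR}/.ollama/models

Without that second ARG line, ${FUNCTION_DIR} expands to an empty string in the ENV instruction, which is why the models ended up under /.ollama/models.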
I think I'm seeing where the problem is, but I don't know how to solve it. I changed the entry_script.sh to be the same as above.
$ docker build --platform linux/x86_64 -t docker-image --build-arg GITHUB_TOKEN=github_token .
$ docker run --platform linux/x86_64 --name docker-image -dit docker-image sleep infinity
$ docker exec -it data-extractor ollama list
Error: could not connect to ollama app, is it running?
What's next:
Try Docker Debug for seamless, persistent debugging tools in any container or image → docker debug data-extractor
Learn more at https://docs.docker.com/go/debug-cli/
$ docker exec -it docker-image env
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
HOSTNAME=6e8481d8e66c
TERM=xterm
SSL_CERT_FILE=/etc/ssl/certs/ca-certificates.crt
OLLAMA_HISTORY=/tmp/ollama
OLLAMA_TMPDIR=/tmp/ollama
OLLAMA_ORIGINS=*
OLLAMA_MODELS=/function/.ollama/models
GIN_MODE=release
OLLAMA_HOST=http://0.0.0.0:11434
HOME=/function
OLLAMA_KEEP_ALIVE=-1
ANONYMIZED_TELEMETRY=False
Are my docker commands doing something unintended? Or is the entry_script not being called for some reason?
What do you have OLLAMA_HOST set to in the data-extractor container? What's the result of docker exec -it data-extractor env?
It worked! But there's an issue that I was hoping to avoid talking about, which is a runtime interface emulator & client.
These are getting in each other's way in the entry_script, where only one of them works at a time right now. Could you help me modify the entry_script so that both of these work at once?
Current script:
#!/bin/sh
cd /function
if [ -z "${AWS_LAMBDA_RUNTIME_API}" ]; then
cd /function && exec /usr/local/bin/aws-lambda-rie /function/venv/bin/python -m awslambdaric $1
else
cd /function && exec /function/venv/bin/python -m awslambdaric $1
fi
(sleep 2 ; ollama run mistral "") &
ollama serve &
exec $@
#!/bin/sh
ollama serve &
sleep 2
ollama run mistral ""
cd /function
if [ -z "${AWS_LAMBDA_RUNTIME_API}" ]; then
exec /usr/local/bin/aws-lambda-rie /function/venv/bin/python -m awslambdaric $1
fi
exec /function/venv/bin/python -m awslambdaric $1
Note that if the python script exits, the container will also exit.
If you want the python script to be able to exit without killing the container:
#!/bin/sh
(
sleep 2
ollama run mistral ""
cd /function
if [ -z "${AWS_LAMBDA_RUNTIME_API}" ]; then
exec /usr/local/bin/aws-lambda-rie /function/venv/bin/python -m awslambdaric $1
fi
exec /function/venv/bin/python -m awslambdaric $1
) &
exec ollama serve
Beautiful, everything is fixed. Thank you @rick-github, you've really helped here above and beyond.
What is the issue?
Running

ollama run mistral ""& /set nohistory & /set quiet &

in a bash script as the entrypoint to a docker container, in an attempt to start mistral right after starting the ollama server and pulling mistral.

Output of

ps aux | grep ollama | grep -v grep

is 0. These are the outputs of running subprocesses:

7 root 0:31 {ollama} /run/rosetta/rosetta /usr/bin/ollama ollama serve
30 root 0:00 {ollama} /run/rosetta/rosetta /usr/bin/ollama ollama run mistral

But when I go to access the model:

$ curl http://0.0.0.0:11434
Ollama is running

litellm.exceptions.APIConnectionError: litellm.APIConnectionError: OllamaException - {"error":"model \"mistral\" not found, try pulling it first"}

If I manually exec into the container and run ollama run mistral, it tries to pull the model all over again despite the model already having been pulled during build. Any help would be wonderful, let me know if there are relevant details missing.
OS
Linux
GPU
Other
CPU
Intel
Ollama version
The output is: ollama version is 0.0.0