Error: pull model manifest: ssh: no key found - Cannot pull model from within containerized Ollama instance

We have a bespoke implementation which leverages Ollama. Up until a couple days ago - the below flow was working, now I'm seeing the error message in the title about no ssh key.

Inside a larger docker-compose.yml - we have this:

  ollama:
    build: ../Services/Backend/LLM/.
    ports:
      - 11434:11434
    volumes:
      - "../Services/Backend/LLM/models:/root/.ollama"
    environment:
      OLLAMA_HOST: "ollama:11434"
      OLLAMA_KEEP_ALIVE: "720m"
      OLLAMA_ORIGINS: "*"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
    restart: unless-stopped
    tty: true
    pull_policy: always

Our ../Services/Backend/LLM directory on the host machine has a folder structure as pictured

The build command there goes to the Dockerfile which is:

FROM ollama/ollama:latest
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]

And the entrypoint.sh script is:

#!/bin/bash

echo "Starting Ollama server..."
ollama serve &

echo "Waiting for Ollama server to be active..."
# Wait until the Ollama server is fully up and running
until ollama list | grep 'NAME'; do
  sleep 1
done

echo "Pulling gemma:7b..."
ollama pull gemma:7b

echo "Running gemma:7b model..."
ollama run gemma:7b

# Keep the container running
tail -f /dev/null

The container boots, and we see this in the logs:

2024-05-15 11:54:18 Starting Ollama server...
2024-05-15 11:54:18 Waiting for Ollama server to be active...
2024-05-15 11:54:18 Error: could not connect to ollama app, is it running?
2024-05-15 11:54:18 2024/05/15 16:54:18 routes.go:1006: INFO server config env="map[OLLAMA_DEBUG:false OLLAMA_LLM_LIBRARY: OLLAMA_MAX_LOADED_MODELS:1 OLLAMA_MAX_QUEUE:512 OLLAMA_MAX_VRAM:0 OLLAMA_NOPRUNE:false OLLAMA_NUM_PARALLEL:1 OLLAMA_ORIGINS:[* http://localhost https://localhost http://localhost:* https://localhost:* http://127.0.0.1 https://127.0.0.1 http://127.0.0.1:* https://127.0.0.1:* http://0.0.0.0 https://0.0.0.0 http://0.0.0.0:* https://0.0.0.0:*] OLLAMA_RUNNERS_DIR: OLLAMA_TMPDIR:]"
2024-05-15 11:54:18 time=2024-05-15T16:54:18.503Z level=INFO source=images.go:704 msg="total blobs: 0"
2024-05-15 11:54:18 time=2024-05-15T16:54:18.507Z level=INFO source=images.go:711 msg="total unused blobs removed: 0"
2024-05-15 11:54:18 time=2024-05-15T16:54:18.512Z level=INFO source=routes.go:1052 msg="Listening on 172.18.0.13:11434 (version 0.1.37)"
2024-05-15 11:54:18 time=2024-05-15T16:54:18.512Z level=INFO source=payload.go:30 msg="extracting embedded files" dir=/tmp/ollama3385453619/runners
2024-05-15 11:54:20 time=2024-05-15T16:54:20.237Z level=INFO source=payload.go:44 msg="Dynamic LLM libraries [cpu cpu_avx cpu_avx2 cuda_v11 rocm_v60002]"
2024-05-15 11:54:20 time=2024-05-15T16:54:20.478Z level=INFO source=types.go:71 msg="inference compute" id=GPU-45c8585d-4186-b302-7504-ed55fe54e08a library=cuda compute=8.9 driver=12.4 name="NVIDIA RTX 6000 Ada Generation" total="48.0 GiB" available="46.1 GiB"
2024-05-15 11:54:20 [GIN] 2024/05/15 - 16:54:20 | 200 |      28.844µs |     172.18.0.13 | HEAD     "/"
2024-05-15 11:54:20 [GIN] 2024/05/15 - 16:54:20 | 200 |    3.704011ms |     172.18.0.13 | GET      "/api/tags"
2024-05-15 11:54:20 NAME        ID      SIZE    MODIFIED 
2024-05-15 11:54:20 Pulling gemma:7b...
2024-05-15 11:54:20 [GIN] 2024/05/15 - 16:54:20 | 200 |      17.945µs |     172.18.0.13 | HEAD     "/"
2024-05-15 11:54:20 pulling manifest ⠋ [GIN] 2024/05/15 - 16:54:20 | 200 |  165.817164ms |     172.18.0.13 | POST     "/api/pull"
pulling manifest 
2024-05-15 11:54:20 Error: pull model manifest: ssh: no key found
2024-05-15 11:54:20 Running gemma:7b model...
2024-05-15 11:54:20 [GIN] 2024/05/15 - 16:54:20 | 200 |      12.143µs |     172.18.0.13 | HEAD     "/"
2024-05-15 11:54:20 [GIN] 2024/05/15 - 16:54:20 | 404 |     585.565µs |     172.18.0.13 | POST     "/api/show"
2024-05-15 11:54:20 [GIN] 2024/05/15 - 16:54:20 | 200 |   61.931041ms |     172.18.0.13 | POST     "/api/pull"
2024-05-15 11:54:20 pulling manifest 
2024-05-15 11:54:20 Error: pull model manifest: ssh: no key found

And this brings us to the issue - why is the container complaining about an SSH key? I'm stuck between "the ollama container environment should have an id_pub.rsa key bundled with it, and doesn't" and "Ollama thinks the model should be mounted from the host file system, and somehow this pull command is actually just saying 'go find it in the mounted volume', it does not find it, so then it tries to pull from the host which needs it's own SSH key, and that doesn't exist, so there are issues"

The biggest point of contention is that this Error message doesn't report where the Error is raised (in what file) so I can't zero-in on exactly the code that's trying to do this pull.

I guess it's here https://github.com/ollama/ollama/blob/f2cf97d6f111031a712881eccb5fbe90fac787c7/server/routes.go#L412

And I guess that the error is this:

        if err := PullModel(ctx, model, regOpts, fn); err != nil {
            ch <- gin.H{"error": err.Error()}
        }

Because we have var req api.PullRequest I can see that api has reqOptions, and I can see the Struct for PullRequest is:

// PullRequest is the request passed to [Client.Pull].
type PullRequest struct {
    Model    string `json:"model"`
    Insecure bool   `json:"insecure,omitempty"`
    Username string `json:"username"`
    Password string `json:"password"`
    Stream   *bool  `json:"stream,omitempty"`

    // Name is deprecated, see Model
    Name string `json:"name"`
}

I guess that insecure is being set to false by default, but again wild guess on my part because we see this:

        regOpts := &registryOptions{
            Insecure: req.Insecure,
        }

And I don't know where registryOptions is initialized with values

So... why are we seeing this ssh error?

docker / genai-stack

Error: pull model manifest: ssh: no key found - Cannot pull model from within containerized Ollama instance #153