replicate / cog-stable-diffusion

Diffusers Stable Diffusion as a Cog model
https://replicate.com/stability-ai/stable-diffusion
Apache License 2.0

"cog run script/download-weights" fails with exit code 137. How can I fix this? (I am very new to this, so any help would be great) #83

Open lleopard1704 opened 1 year ago

lleopard1704 commented 1 year ago

@lleopard1704 ➜ /workspaces/cog-stable-diffusion (main) $ cog run script/download-weights
Building Docker image from environment in cog.yaml...
[+] Building 80.7s (16/17)
=> [internal] load build definition from Dockerfile 0.2s
=> => transferring dockerfile: 1.89kB 0.0s
=> [internal] load .dockerignore 0.2s
=> => transferring context: 34B 0.0s
=> resolve image config for docker.io/docker/dockerfile:1.2 1.6s
=> [auth] docker/dockerfile:pull token for registry-1.docker.io 0.0s
=> CACHED docker-image://docker.io/docker/dockerfile:1.2@sha256:e2a8561e419ab1ba6b2fe6cbdf49fd92b95912df1cf7d313c3e2230a333fdbcc 0.0s
=> [internal] load metadata for docker.io/nvidia/cuda:11.6.0-cudnn8-devel-ubuntu20.04 1.0s
=> [auth] nvidia/cuda:pull token for registry-1.docker.io 0.0s
=> [stage-0 1/9] FROM docker.io/nvidia/cuda:11.6.0-cudnn8-devel-ubuntu20.04@sha256:6a4ef3d0032001ab91e0e6ecc27ebf59dd122a531703de8f64cc84 0.0s
=> [internal] load build context 0.1s
=> => transferring context: 42.24kB 0.0s
=> CACHED [stage-0 2/9] RUN --mount=type=cache,target=/var/cache/apt set -eux; apt-get update -qq; apt-get install -qqy --no-install-reco 0.0s
=> CACHED [stage-0 3/9] RUN --mount=type=cache,target=/var/cache/apt apt-get update -qq && apt-get install -qqy --no-install-recommends 0.0s
=> CACHED [stage-0 4/9] RUN curl -s -S -L https://raw.githubusercontent.com/pyenv/pyenv-installer/master/bin/pyenv-installer | bash && g 0.0s
=> CACHED [stage-0 5/9] COPY .cog/tmp/build1717844414/cog-0.0.1.dev-py3-none-any.whl /tmp/cog-0.0.1.dev-py3-none-any.whl 0.0s
=> CACHED [stage-0 6/9] RUN --mount=type=cache,target=/root/.cache/pip pip install /tmp/cog-0.0.1.dev-py3-none-any.whl 0.0s
=> CACHED [stage-0 7/9] COPY .cog/tmp/build1717844414/requirements.txt /tmp/requirements.txt 0.0s
=> ERROR [stage-0 8/9] RUN --mount=type=cache,target=/root/.cache/pip pip install -r /tmp/requirements.txt 76.5s

[stage-0 8/9] RUN --mount=type=cache,target=/root/.cache/pip pip install -r /tmp/requirements.txt:
16 4.933 Looking in indexes: https://pypi.org/simple, https://download.pytorch.org/whl/cu116
16 6.026 Collecting diffusers==0.11.1
16 6.059 Using cached diffusers-0.11.1-py3-none-any.whl (524 kB)
16 7.201 Collecting torch==1.13.0+cu116
16 7.208 Downloading https://download.pytorch.org/whl/cu116/torch-1.13.0%2Bcu116-cp310-cp310-linux_x86_64.whl (1983.0 MB)
16 74.74 /root/.pyenv/pyenv.d/exec/pip-rehash/pip: line 20: 84 Killed "$PYENV_COMMAND_PATH" "$@"

executor failed running [/bin/sh -c pip install -r /tmp/requirements.txt]: exit code: 137
ⅹ Failed to build Docker image: exit status 1

Harrolee commented 3 weeks ago

I get the same error

Harrolee commented 3 weeks ago

As it turns out, exit code 137 usually means the process was killed for running out of memory.
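For anyone wondering where the number comes from: by shell convention, an exit code above 128 means the process was terminated by a signal, and the signal number is the code minus 128. So 137 is signal 9 (SIGKILL), which is what the kernel's out-of-memory killer sends, and it matches the "Killed" line in the log above. Here is a minimal Python sketch of that decoding. The function name `describe_exit_code` is just something I made up for illustration, and the signal names it prints assume a Linux or macOS host.

```python
import signal


def describe_exit_code(code: int) -> str:
    """Explain a shell-style exit code.

    Codes above 128 conventionally mean "terminated by signal (code - 128)".
    """
    if code > 128:
        signum = code - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            # Signal number is not defined on this platform.
            name = f"signal {signum}"
        return f"killed by {name}"
    if code == 0:
        return "exited normally"
    return f"exited with error {code}"


print(describe_exit_code(137))
```

SIGKILL by itself does not prove an out-of-memory kill, since anything can send it, but during a large `pip install` inside a memory-limited Docker build it is the most likely explanation.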

In my case, Docker had an upper limit of 8 GB defined for memory use. My error was resolved after I increased the maximum available RAM in Docker Desktop.
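If you want to see how much memory Docker actually has before changing anything, `docker info --format '{{.MemTotal}}'` prints the total memory visible to the Docker daemon in bytes. Below is a rough Python sketch that wraps that command. The helper names `bytes_to_gib` and `docker_memory_bytes` are hypothetical, and the script simply reports nothing if the Docker CLI is missing or the daemon is not running.

```python
import shutil
import subprocess
from typing import Optional


def bytes_to_gib(n: int) -> float:
    """Convert a byte count to GiB."""
    return n / (1024 ** 3)


def docker_memory_bytes() -> Optional[int]:
    """Return the memory visible to the Docker daemon, or None if unavailable."""
    if shutil.which("docker") is None:
        return None
    try:
        out = subprocess.run(
            ["docker", "info", "--format", "{{.MemTotal}}"],
            capture_output=True,
            text=True,
            check=True,
            timeout=30,
        ).stdout.strip()
        return int(out)
    except (subprocess.SubprocessError, ValueError, OSError):
        # Daemon not running, command timed out, or unexpected output.
        return None


mem = docker_memory_bytes()
if mem is None:
    print("Could not query Docker (CLI missing or daemon not running)")
else:
    print(f"Docker can use about {bytes_to_gib(mem):.1f} GiB of RAM")
```

For scale, the torch wheel in the log above is roughly 2 GB on its own, so a low limit may be hit during `pip install`. How much headroom is actually needed is not something this thread establishes.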

Harrolee commented 3 weeks ago

A related error that you might get is "no space left on device".

If, like me, you are using docker buildx to build an image for a different architecture than your host machine, the "device" in this case is not your host machine but the Docker container responsible for building your image.

You can increase the amount of drive space available for a docker container in the same menu where you configure available RAM.
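Before raising the disk limit, it can help to see what is using the space. `docker system df` summarizes the disk used by images, containers, volumes, and build cache. Here is a small Python sketch that runs it. The function name `docker_disk_report` is made up for illustration, and it returns nothing useful if Docker is not installed or the daemon is not running.

```python
import shutil
import subprocess
from typing import Optional


def docker_disk_report() -> Optional[str]:
    """Return the output of `docker system df`, or None if Docker is unavailable."""
    if shutil.which("docker") is None:
        return None
    try:
        result = subprocess.run(
            ["docker", "system", "df"],
            capture_output=True,
            text=True,
            check=True,
            timeout=60,
        )
    except (subprocess.SubprocessError, OSError):
        # Daemon not running, or the command failed or timed out.
        return None
    return result.stdout


report = docker_disk_report()
print(report if report is not None else "Docker CLI not available or daemon not running")
```

If the build cache turns out to be the large entry, `docker builder prune` can reclaim it. It asks for confirmation first, and the next build will be slower because the cached layers are gone.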