Open ormedo opened 1 year ago
Hey!
I would have said just rerun the build command and it will retry again from the last successful (and cached) step, but as you say, you keep getting the same error.
Is it always on the same file, and the same number of bytes? Are you using a proxy?
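If the failure turns out to be intermittent, the rerun can be scripted. This is only a minimal sketch: the `run_with_retries` helper, the retry count, and the delay are illustrative and not part of docker-diffusers-api.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay_seconds=0.0):
    """Run cmd until it exits with status 0 or the attempts run out.

    Returns the number of the attempt that succeeded, or 0 if all failed.
    """
    for attempt in range(1, attempts + 1):
        result = subprocess.run(cmd)
        if result.returncode == 0:
            return attempt
        if attempt < attempts:
            # Give a flaky network a moment before trying again.
            time.sleep(delay_seconds)
    return 0
```

Something like `run_with_retries(["docker", "build", "-t", "my-image", "."], delay_seconds=10)` would then rerun the build, where `my-image` is a placeholder tag. Each rerun should pick up from the cached layers, so only the failing step is repeated.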
Hi!
Thanks for your time supporting us. I think it could be a bad file or a networking issue. Upgrading to version 11.7 fixed the problem.
Hi! Just another conflict.
In this case with the python version.
Hey! Unfortunately xformers only has precompiled binaries for a very select list of package version combinations (I have some notes about this at the top of the Dockerfile). 11.7 won't work. You could try 11.3 though.
P.S. I don't know much about running diffusers on an M1 beyond that it's possible. You may well need to search the docker-diffusers-api codebase for anywhere I've written "cuda" and replace it with "mps". I'll try to fix this in a future release so this won't be necessary (you're the first person to try this :sweat_smile:) Please do report on your findings, would love to get this working for all M1 users!
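To find every place that needs changing, a small script can list the matches. This is a rough sketch, not something that ships with the repo: `find_mentions` is a made-up helper, and limiting the scan to `.py` files is an assumption.

```python
from pathlib import Path


def find_mentions(root, needle="cuda", suffixes=(".py",)):
    """Yield (path, line_number, line) for each line that contains needle.

    The match is case-insensitive; only files with the given suffixes are read.
    """
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for number, line in enumerate(text.splitlines(), start=1):
            if needle.lower() in line.lower():
                yield path, number, line.strip()
```

Running `for path, number, line in find_mentions("."): print(f"{path}:{number}: {line}")` from the repo root prints each hit so it can be reviewed by hand before replacing anything.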
It works on M1 with 11.3 :D but it exited after a few seconds with no visible logs, at least as far as I can tell :S
Here are the logs from inside the container.
Traceback (most recent call last):
File "/api/server.py", line 12, in conda run /bin/bash -c python3 -u server.py
failed. (See above for error)
libc10_cuda.so: cannot open shared object file: No such file or directory
WARNING: libc10_cuda.so: cannot open shared object file: No such file or directory
Need to compile C++ extensions to get sparse attention support. Please run python setup.py build develop
Hey, thanks! Logs make it much easier to understand what's going on.
So yeah, as I suspected, unfortunately we're going to have to look for any code that references NVIDIA's "cuda" and remove it if it's not needed, or replace it with "mps" where possible, to work on Apple M1.
I would really love to make docker-diffusers-api work out of the box with M1, but it's going to be quite a while until I'll have the time to be actively involved here :(
In the meantime, the line in question can be removed entirely (app.py line 53: device: torch.cuda...). And you'll need to search through all the files for any other mention of "cuda" and replace it with "mps" (especially anything like .to("cuda"), device="cuda", or anything like that).
Again, I wish I could help more, and look into automatically detecting the right GPU, but I just don't have time at the moment, and really am not sure when I will :( But please keep this issue open, please keep us updated with your progress, and I will take a more active role here when I can. I'll also be available to answer questions to the best of my ability (but I really have zero experience with Apple, unfortunately).
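For what it's worth, automatic detection could look roughly like the sketch below. It is untested on real M1 hardware and is not how docker-diffusers-api does it today. `pick_device` and `detect_device` are hypothetical names. The sketch assumes PyTorch 1.12 or newer, which is where `torch.backends.mps` is available.

```python
def pick_device(cuda_available, mps_available):
    """Prefer CUDA, then Apple's MPS backend, then fall back to the CPU."""
    if cuda_available:
        return "cuda"
    if mps_available:
        return "mps"
    return "cpu"


def detect_device():
    """Ask PyTorch which backends are usable and pick one of them."""
    import torch  # imported lazily so pick_device works without torch installed

    # Older PyTorch builds have no torch.backends.mps, so look it up defensively.
    mps_backend = getattr(torch.backends, "mps", None)
    mps_available = mps_backend is not None and mps_backend.is_available()
    return pick_device(torch.cuda.is_available(), mps_available)
```

The idea would be to call `detect_device()` once at startup and pass the result to every `.to(...)` and `device=...` call instead of hard-coding "cuda".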
I understand. I want to test before going to production in Banana's environment. But the 1-click installation works great!
Oh, awesome! That's great. Thanks for reporting back about that.. at least you can still play in the meantime :)
I should have a chance to look at this next week... if we're lucky, it will all just work afterwards. Otherwise it will take a lot longer :sweat_smile: Do you know any good places to rent M1's online? I think one of the companies I've used before has them, I'll try to remember :sweat_smile:
AWS offers M1 Mac mini instances, if I remember correctly.
Oh great, thanks!
More future ref stuff for me...
https://pytorch.org/docs/stable/notes/mps.html
https://chrisdare.medium.com/running-pytorch-on-apple-silicon-m1-gpus-a8bb6f680b02
Hi!
I just downloaded the project and tried to build and deploy the Docker image on my M1. I always get the same error.
[+] Building 1320.8s (18/44)
=> [internal] load build definition from Dockerfile 0.0s
=> => transferring dockerfile: 6.90kB 0.0s
=> [internal] load .dockerignore 0.0s
=> => transferring context: 2B 0.0s
=> [internal] load metadata for docker.io/pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime 1.5s
=> [internal] load build context 0.0s
=> => transferring context: 64.22kB 0.0s
=> CACHED [base 1/5] FROM docker.io/pytorch/pytorch:1.12.1-cuda11.3-cudnn8-runtime@sha256:0bc0971dc8ae319af610d493aced87df46255c9508a8b9e9bc365f11a56e7b75 0.0s
=> [base 2/5] RUN if [ -n "" ] ; then echo quit | openssl s_client -proxy $(echo | cut -b 8-) -servername google.com -connect google.com:443 -showcerts | sed 'H;1h; 0.3s
=> [base 3/5] RUN apt-get update 14.3s
=> [base 4/5] RUN apt-get install -yqq git 27.6s
=> [base 5/5] RUN apt-get install -yqq zstd 8.3s
=> [output 1/32] RUN mkdir /api 0.5s
=> [patchmatch 1/3] WORKDIR /tmp 0.0s
=> [patchmatch 2/3] COPY scripts/patchmatch-setup.sh . 0.0s
=> [patchmatch 3/3] RUN sh patchmatch-setup.sh 0.4s
=> [output 2/32] WORKDIR /api 0.0s
=> [output 3/32] RUN conda update -n base -c defaults conda 101.1s
=> [output 4/32] RUN conda create -n xformers python=3.10 33.9s
=> [output 5/32] RUN python --version 6.3s
=> ERROR [output 6/32] RUN conda install -c pytorch -c conda-forge cudatoolkit=11.6 pytorch=1.12.1 1126.9s
14 1110.5 CondaError: Downloaded bytes did not match Content-Length
14 1110.5 url: https://conda.anaconda.org/pytorch/linux-64/pytorch-1.12.1-py3.10_cuda11.6_cudnn8.3.2_0.tar.bz2
14 1110.5 target_path: /opt/conda/pkgs/pytorch-1.12.1-py3.10_cuda11.6_cudnn8.3.2_0.tar.bz2
14 1110.5 Content-Length: 1284916176
14 1110.5 downloaded bytes: 1100035059
14 1110.5
14 1110.5
14 1110.5
14 1126.1 ERROR conda.cli.main_run:execute(47): conda run /bin/bash -c conda install -c pytorch -c conda-forge cudatoolkit=11.6 pytorch=1.12.1 failed. (See above for error)
executor failed running [/opt/conda/bin/conda run --no-capture-output -n xformers /bin/bash -c conda install -c pytorch -c conda-forge cudatoolkit=11.6 pytorch=1.12.1]: exit code: 1
I understand that it's a download problem, but I'm not good enough with Docker to fix it.
Any suggestions?
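To put the CondaError numbers in perspective, the two byte counts from the log can be compared directly. This is just a quick sketch with a made-up helper name, `download_shortfall`. It only does the arithmetic and does not reproduce conda's own check.

```python
def download_shortfall(content_length, downloaded_bytes):
    """Return (missing_bytes, fraction_received) for an interrupted download."""
    missing = content_length - downloaded_bytes
    fraction = downloaded_bytes / content_length
    return missing, fraction


# Values copied from the CondaError in the build log above.
missing, fraction = download_shortfall(1284916176, 1100035059)
print(f"missing {missing} bytes, received {fraction:.1%}")
```

A transfer that stops partway through a package of more than 1 GB looks more like a dropped connection than a corrupt file, which is why checking whether the byte count is the same on every attempt helps narrow it down.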