Closed lukebelbina closed 2 years ago
The Docker environment was not made by me; contributor @oc013 provided it. I can take a quick look to see if I can spot what's wrong, though.
Is there a 'src' folder with several other folders inside of it?
I think maybe this line is cleaning those folders, it's not a step that's used in the normal setup
here are the contents of src in the container:
(ldm) root@33074642d96d:/src# ls
Dockerfile assets environment.yaml notebook_helpers.py setup.py
LICENSE configs ldm optimizedSD src
'Launch Waifu Diffusion.lnk' data ldm.cmd outputs txt2img.yaml
README.md docker-compose.yml main.py run.cmd webui.cmd
Stable_Diffusion_v1_Model_Card.md entrypoint.sh models scripts webuildm.cmd
Contents of src\src, I should say. This might also be an issue where the Dockerfile names its own main folder src (it would usually be called stable-diffusion), and then there's the other src folder inside it.
Then, depending on which directory is the working directory when the script is launched, this might cause an issue as well.
But I think it's that clean line; waiting to see the contents of src\src.
Here are the contents of src/src:
(ldm) root@33074642d96d:/src/src# ls
gfpgan realesrgan
I tried a rebuild with the conda clean --all && \ line removed and got the same issue.
So the src/src folder still only has gfpgan and realesrgan?
OK, I've changed the Docker files to use /sd/ as the root directory, equivalent to stable-diffusion in the regular setup. See if that works; it could be some sort of conflict with there being two directories named src.
Please show the output of the build when it's failing. Your output was only launching the container, not from building the image where it would have failed.
You could run these commands to get a mostly fresh build:
docker-compose down
docker-compose build --no-cache --progress=plain
It should look something like:
#10 638.3 Running setup.py develop for GFPGAN
#10 638.3 Running setup.py develop for taming-transformers
#10 638.3 Running setup.py develop for realesrgan
#10 638.3 Running setup.py develop for latent-diffusion
#10 638.3 Running setup.py develop for k-diffusion
#10 638.3 Running setup.py develop for clip
#10 638.3 Successfully installed GFPGAN Jinja2-3.1.2 MarkupSafe-2.1.1 PyWavelets-1.3.0 absl-py-1.2.0 accelerate-0.12.0 addict-2.4.0 aiohttp-3.8.1 aiosignal-1.2.0 albumentations-0.4.3 altair-4.2.0 analytics-python-1.4.0 antlr4-python3-runtime-4.8 anyio-3.6.1 async-timeout-4.0.2 attrs-22.1.0 backoff-1.10.0 backports.zoneinfo-0.2.1 basicsr-1.4.1 bcrypt-4.0.0 blinker-1.5 cachetools-5.2.0 chardet-4.0.0 clean-fid-0.1.28 click-8.1.3 clip commonmark-0.9.1 cycler-0.11.0 decorator-5.1.1 docker-pycreds-0.4.0 einops-0.3.0 entrypoints-0.4 facexlib-0.2.4 fastapi-0.81.0 ffmpy-0.3.0 filelock-3.8.0 filterpy-1.4.5 fonttools-4.37.1 frozenlist-1.3.1 fsspec-2022.7.1 ftfy-6.1.1 future-0.18.2 gitdb-4.0.9 gitpython-3.1.27 google-auth-2.11.0 google-auth-oauthlib-0.4.6 gradio-3.1.6 grpcio-1.47.0 h11-0.12.0 httpcore-0.15.0 httpx-0.23.0 huggingface-hub-0.9.1 idna-2.10 imageio-2.9.0 imageio-ffmpeg-0.4.2 imgaug-0.2.6 importlib-metadata-4.12.0 importlib-resources-5.9.0 jsonmerge-1.8.0 jsonschema-4.14.0 k-diffusion kiwisolver-1.4.4 kornia-0.6.0 latent-diffusion linkify-it-py-1.0.3 llvmlite-0.39.0 lmdb-1.3.0 markdown-3.4.1 markdown-it-py-2.1.0 matplotlib-3.5.3 mdit-py-plugins-0.3.0 mdurl-0.1.2 monotonic-1.6 multidict-6.0.2 networkx-2.8.6 numba-0.56.0 oauthlib-3.2.0 omegaconf-2.1.1 opencv-python-4.1.2.30 opencv-python-headless-4.1.2.30 orjson-3.8.0 packaging-21.3 pandas-1.4.3 paramiko-2.11.0 pathtools-0.1.2 pkgutil-resolve-name-1.3.10 promise-2.3 protobuf-3.19.4 psutil-5.9.1 pudb-2019.2 pyDeprecate-0.3.1 pyarrow-9.0.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pycryptodome-3.15.0 pydantic-1.9.2 pydeck-0.8.0b1 pydub-0.25.1 pygments-2.13.0 pympler-1.0.1 pynacl-1.5.0 pynvml-11.4.1 pyparsing-3.0.9 pyrsistent-0.18.1 python-dateutil-2.8.2 python-multipart-0.0.5 pytorch-lightning-1.4.2 pytz-2022.2.1 pytz-deprecation-shim-0.1.0.post0 pyyaml-6.0 realesrgan regex-2022.8.17 requests-2.25.1 requests-oauthlib-1.3.1 resize-right-0.0.2 rfc3986-1.5.0 rich-12.5.1 rsa-4.9 scikit-image-0.19.3 scipy-1.9.1 semver-2.13.0 
sentry-sdk-1.9.5 setproctitle-1.3.2 shortuuid-1.0.9 smmap-5.0.0 sniffio-1.2.0 starlette-0.19.1 streamlit-1.12.2 taming-transformers tb-nightly-2.11.0a20220827 tensorboard-2.10.0 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 test-tube-0.7.5 tifffile-2022.8.12 tokenizers-0.12.1 toml-0.10.2 toolz-0.12.0 torch-fidelity-0.3.0 torchdiffeq-0.2.3 torchmetrics-0.6.0 tornado-6.2 tqdm-4.64.0 transformers-4.19.2 tzdata-2022.2 tzlocal-4.2 uc-micro-py-1.0.1 urwid-2.1.2 uvicorn-0.18.3 validators-0.20.0 wandb-0.13.2 watchdog-2.1.9 wcwidth-0.2.5 websockets-10.3 werkzeug-2.2.2 yapf-0.32.0 yarl-1.8.1 zipp-3.8.1
Show the output from here where it's failing to install the missing dependencies. Also please provide the output of:
docker version
docker-compose -v
PS I would suggest reverting the changes, the files being in /src should not be an issue. conda clean --all just cleans up cache files to make the docker image smaller.
-a, --all
Remove index cache, lock files, unused cache packages, and tarballs.
I'll test with everything moved to /sd
...
I see the same error now after pulling all the changes, working on it.
sd | Traceback (most recent call last):
sd | File "scripts/webui.py", line 39, in <module>
sd | import k_diffusion as K
sd | ModuleNotFoundError: No module named 'k_diffusion'
sd | Relauncher: Process is ending. Relaunching in 0.5s...
sd | Relauncher: Launching...
Thanks @oc013
From what I saw, the k-diffusion directory doesn't exist, and there should be another one as well. They are both installed from the environment.yaml, or should be, so I think they are probably getting deleted for some reason.
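A quick way to confirm which of those directories are actually missing is a small check like the one below. This is only a sketch: `missing_src_dirs` is a hypothetical helper, and the directory names in the example call are assumptions based on the package names in the build log, not verified paths.

```shell
#!/usr/bin/env bash
# Print the name of every expected directory that is missing under a src directory.
# Usage: missing_src_dirs <src_dir> <name>...
missing_src_dirs() {
    local src_dir="$1"
    shift
    local name
    for name in "$@"; do
        if [ ! -d "$src_dir/$name" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Example call from inside the container (directory names are assumed):
# missing_src_dirs /src/src gfpgan realesrgan k-diffusion taming-transformers clip
```

If this prints nothing, every listed directory exists; any name it prints is a package whose source checkout is gone.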
@hlky I think I know what the problem is, I probably need to move the initial setup for the conda env into the entrypoint script as well. The volume that saves the conda env is possibly overwriting the initial setup. Give me a bit I'll work it out.
This would normally have worked, but because the conda env installs some packages into the /sd/src directory, I think that when the volume for /sd later gets mounted it overwrites to an incomplete state.
Have you got it pulling changes when the container starts or anything like that? A git pull of this repo will pull the stable version; you could probably add an option to switch to dev. Check .github/sync.yml on stable-diffusion-webui to see which files are synced and their destinations in this repo. There was also a change to webui.cmd that updates the environment from the environment.yaml file if there are any changes; if there are no changes it doesn't do anything. It's safe to run both the conda env create command and the update command without affecting the environment: they only properly run if either the environment doesn't exist or the environment.yaml differs from the environment's current setup. Just wanted to check you knew about that so you could implement it if needed.
Once I deleted all the caching I had built in, I realized the issue. I believe this issue should be resolved once the above is merged, my apologies.
@lukebelbina you can run the ./docker-reset.sh file I added, or run the individual commands inside of it, to fully clear things up and reclaim some disk space, then docker-compose up and let me know how it goes
@hlky I did see that, just was trying to reduce the start up time so I added a small bit of code to only run those commands if the conda env doesn't exist or if the date modified timestamp on the environment.yaml changes
One bonus is the docker image is now only 2.3GB instead of 8GB
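For anyone curious, the start-up check described above (only rebuild when the conda env is missing or environment.yaml has been modified) could look roughly like this. It is a minimal sketch, not the actual entrypoint.sh: `env_needs_update` and the stamp file are hypothetical, and the paths in the usage comment are placeholders.

```shell
#!/usr/bin/env bash
# Return 0 (true) when the conda env should be created or updated, 1 otherwise.
# env_dir:    directory of the conda env
# env_file:   path to environment.yaml
# stamp_file: hypothetical marker file touched after the last successful setup
env_needs_update() {
    local env_dir="$1" env_file="$2" stamp_file="$3"
    # No env yet, or no record of a previous setup: build it.
    [ -d "$env_dir" ] || return 0
    [ -f "$stamp_file" ] || return 0
    # environment.yaml was modified after the last setup: update.
    [ "$env_file" -nt "$stamp_file" ] && return 0
    return 1
}

# Usage sketch (placeholder paths):
# if env_needs_update "$ENV_DIR" /sd/environment.yaml "$ENV_DIR/.setup-stamp"; then
#     # run the conda env create / conda env update step here
#     touch "$ENV_DIR/.setup-stamp"
# fi
```

Comparing modification times keeps the check cheap, at the cost of triggering an update when the file is merely touched without its contents changing.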
@oc013 and @hlky - thank you both so much for fixing this and your work, confirmed it is now working this morning!
After running docker-compose up I get the following ModuleNotFoundError, but for frontend instead of 'k_diffusion':
Traceback (most recent call last):
sd | File "/sd/scripts/webui.py", line 3, in <module>
sd | from frontend.frontend import draw_gradio_ui
sd | ModuleNotFoundError: No module named 'frontend'
sd | entrypoint.sh: Process is ending. Relaunching in 0.5s...
sd | entrypoint.sh: Launching...'
sd | Relaunch count: 135
Running ./docker-reset.sh or recloning with a new image results in the same error.
@mohammedalsayegh this is not a docker problem, the code is wrong as far as I can tell
it's an issue with the Dockerfile not pulling the changes from the repo properly
the ui was refactored and changes need to be pulled from the main repo
The Dockerfile does not pull the code, it is up to the user to git pull updates from the repo and restart the container which will use the code via a mounted volume. I detailed everything here #93 for future reference should you need to update but it should be a minimal maintenance system other than possibly adding new model files that need to be pre-downloaded.
Working on another project now, but I recloned this repo and ran everything fresh per the instructions and it works with no problem; my mistake, it's the user.
I see now he did not update the code from the repo because it still has that stray single quote in the one message in the console. It's not relevant to this issue he commented on.
Both webui.cmd and webuildm.cmd worked with no issue under a conda-built environment, so I thought something was likely missing in the Dockerfile.
It works from the /home/user_name/ directory in WSL, but I get the above path error when the repository clone is placed in /mnt/c/..etc
@mohammedalsayegh I'm not sure what /mnt/c/..etc is, maybe it's a Windows thing
Run this and send the output
docker exec -it sd bash
pwd
In my context, I was using /mnt/c/..etc to refer to any subdirectory of C under WSL. Which turned out not to be the case. As soon as Docker and WSL were rebooted, it worked.
In spite of deleting the docker volume and image, changing the directory, and cloning again, I have not been able to reproduce the same error. There may have been a problem with WSL accessing Windows directory permissions at runtime.
There was a second error during docker compose up, but it was resolved after multiple attempts of sudo docker compose up.
[+] Building 4.8s (3/3) FINISHED
=> [internal] load build definition from Dockerfile 0.1s
=> => transferring dockerfile: 731B 0.0s
=> [internal] load .dockerignore 0.1s
=> => transferring context: 2B 0.0s
=> ERROR [internal] load metadata for docker.io/nvidia/cuda:11.3.1-runtime-ubuntu20.04 4.7s
------
> [internal] load metadata for docker.io/nvidia/cuda:11.3.1-runtime-ubuntu20.04:
------
failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to create LLB definition: failed to do request: Head "https://registry-1.docker.io/v2/nvidia/cuda/manifests/11.3.1-runtime-ubuntu20.04": EOF
The sd container was allocated correctly on the second run:
(base) PS C:\Users\red> docker exec -it sd bash
(ldm) root@25ee48096a4e:/sd# pwd
/sd
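Since the registry EOF error above went away on a later attempt, a small retry wrapper can save rerunning the command by hand. This is a generic sketch: `retry` is a hypothetical helper, and the attempt count and delay in the example are arbitrary.

```shell
#!/usr/bin/env bash
# Run a command up to max_attempts times, sleeping delay seconds between failures.
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
retry() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1
    until "$@"; do
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "giving up after $attempt attempts: $*" >&2
            return 1
        fi
        echo "attempt $attempt failed, retrying in ${delay}s..." >&2
        attempt=$((attempt + 1))
        sleep "$delay"
    done
}

# Example (values are arbitrary): retry 5 10 docker compose build
```

This only helps with transient failures such as a dropped connection to the registry; a real build error will simply fail the same way on every attempt.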
Ran docker-compose up and getting an error `ModuleNotFoundError: No module named 'k_diffusion'`. Full output here: