lukebelbina commented 2 years ago

Ran docker-compose up and got the error `ModuleNotFoundError: No module named 'k_diffusion'`. Full output here:

$ docker-compose up
[+] Running 1/0
 ⠿ Container sd  Created                                                                                                                         0.0s
Attaching to sd
sd  | Validating model files...
sd  | checking model.ckpt...
sd  | model.ckpt is valid!
sd  | 
sd  | checking GFPGANv1.3.pth...
sd  | GFPGANv1.3.pth is valid!
sd  | 
sd  | checking RealESRGAN_x4plus.pth...
sd  | RealESRGAN_x4plus.pth is valid!
sd  | 
sd  | checking RealESRGAN_x4plus_anime_6B.pth...
sd  | RealESRGAN_x4plus_anime_6B.pth is valid!
sd  | 
sd  | Relauncher: Launching...
sd  | Traceback (most recent call last):
sd  |   File "scripts/webui.py", line 36, in <module>
sd  |     import k_diffusion as K
sd  | ModuleNotFoundError: No module named 'k_diffusion'
sd  | Relauncher: Process is ending. Relaunching in 0.5s...
sd  | Relauncher: Launching...
sd  |   Relaunch count: 1
sd  | Traceback (most recent call last):
sd  |   File "scripts/webui.py", line 36, in <module>
sd  |     import k_diffusion as K
sd  | ModuleNotFoundError: No module named 'k_diffusion'
sd  | Relauncher: Process is ending. Relaunching in 0.5s...
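
A quick way to confirm that the interpreter inside the container really cannot see the package (rather than the import failing for another reason) is to ask the import system directly. This is a minimal sketch, not part of the repo: the helper name `report_missing` is made up for illustration, and the only module name taken from the log above is `k_diffusion`.

```python
import importlib.util
import sys


def report_missing(modules):
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # k_diffusion is the import that fails in the traceback; add others as needed.
    missing = report_missing(["k_diffusion"])
    print("interpreter:", sys.executable)
    print("missing modules:", missing or "none")
```

Running it with the same Python that launches scripts/webui.py (for example after `docker exec -it sd bash`) shows which interpreter is in use and whether the package is visible to it.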

hlky commented 2 years ago

The Docker environment was not made by me; contributor @oc013 provided it. I can take a quick look to see if I can spot what's wrong, though.

hlky commented 2 years ago

Is there a 'src' folder with several other folders inside of it?

I think maybe this line is cleaning those folders, it's not a step that's used in the normal setup

lukebelbina commented 2 years ago

Is there a 'src' folder with several other folders inside of it?

I think maybe this line is cleaning those folders, it's not a step that's used in the normal setup

here are the contents of src in the container:

(ldm) root@33074642d96d:/src# ls
 Dockerfile                          assets               environment.yaml   notebook_helpers.py   setup.py
 LICENSE                             configs              ldm                optimizedSD           src
'Launch Waifu Diffusion.lnk'         data                 ldm.cmd            outputs               txt2img.yaml
 README.md                           docker-compose.yml   main.py            run.cmd               webui.cmd
 Stable_Diffusion_v1_Model_Card.md   entrypoint.sh        models             scripts               webuildm.cmd

hlky commented 2 years ago

I should say the contents of src\src. This might also be an issue where the Dockerfile names its own main folder src (it would usually be called stable-diffusion), and there's the other src folder inside it.

Then, depending on which directory is the working directory when the script is launched, this might cause an issue as well.

But I think it's that clean line. Waiting to see the contents of src\src.

lukebelbina commented 2 years ago

Here are the contents of src/src:

(ldm) root@33074642d96d:/src/src# ls
gfpgan  realesrgan

lukebelbina commented 2 years ago

I tried a rebuild with the conda clean --all && \ line removed and got the same issue.

hlky commented 2 years ago

So the src/src folder still only has gfpgan and realesrgan?

hlky commented 2 years ago

OK, I've changed the Docker files to use /sd/ as the root directory, equivalent to stable-diffusion in the regular setup. See if that works; it could be some sort of conflict with there being two directories named src.

oc013 commented 2 years ago

Please show the output of the build when it's failing. Your output was only launching the container, not from building the image where it would have failed.

You could run these commands to get a mostly fresh build:

docker-compose down
docker-compose build --no-cache --progress=plain

It should look something like:

#10 638.3   Running setup.py develop for GFPGAN
#10 638.3   Running setup.py develop for taming-transformers
#10 638.3   Running setup.py develop for realesrgan
#10 638.3   Running setup.py develop for latent-diffusion
#10 638.3   Running setup.py develop for k-diffusion
#10 638.3   Running setup.py develop for clip
#10 638.3 Successfully installed GFPGAN Jinja2-3.1.2 MarkupSafe-2.1.1 PyWavelets-1.3.0 absl-py-1.2.0 accelerate-0.12.0 addict-2.4.0 aiohttp-3.8.1 aiosignal-1.2.0 albumentations-0.4.3 altair-4.2.0 analytics-python-1.4.0 antlr4-python3-runtime-4.8 anyio-3.6.1 async-timeout-4.0.2 attrs-22.1.0 backoff-1.10.0 backports.zoneinfo-0.2.1 basicsr-1.4.1 bcrypt-4.0.0 blinker-1.5 cachetools-5.2.0 chardet-4.0.0 clean-fid-0.1.28 click-8.1.3 clip commonmark-0.9.1 cycler-0.11.0 decorator-5.1.1 docker-pycreds-0.4.0 einops-0.3.0 entrypoints-0.4 facexlib-0.2.4 fastapi-0.81.0 ffmpy-0.3.0 filelock-3.8.0 filterpy-1.4.5 fonttools-4.37.1 frozenlist-1.3.1 fsspec-2022.7.1 ftfy-6.1.1 future-0.18.2 gitdb-4.0.9 gitpython-3.1.27 google-auth-2.11.0 google-auth-oauthlib-0.4.6 gradio-3.1.6 grpcio-1.47.0 h11-0.12.0 httpcore-0.15.0 httpx-0.23.0 huggingface-hub-0.9.1 idna-2.10 imageio-2.9.0 imageio-ffmpeg-0.4.2 imgaug-0.2.6 importlib-metadata-4.12.0 importlib-resources-5.9.0 jsonmerge-1.8.0 jsonschema-4.14.0 k-diffusion kiwisolver-1.4.4 kornia-0.6.0 latent-diffusion linkify-it-py-1.0.3 llvmlite-0.39.0 lmdb-1.3.0 markdown-3.4.1 markdown-it-py-2.1.0 matplotlib-3.5.3 mdit-py-plugins-0.3.0 mdurl-0.1.2 monotonic-1.6 multidict-6.0.2 networkx-2.8.6 numba-0.56.0 oauthlib-3.2.0 omegaconf-2.1.1 opencv-python-4.1.2.30 opencv-python-headless-4.1.2.30 orjson-3.8.0 packaging-21.3 pandas-1.4.3 paramiko-2.11.0 pathtools-0.1.2 pkgutil-resolve-name-1.3.10 promise-2.3 protobuf-3.19.4 psutil-5.9.1 pudb-2019.2 pyDeprecate-0.3.1 pyarrow-9.0.0 pyasn1-0.4.8 pyasn1-modules-0.2.8 pycryptodome-3.15.0 pydantic-1.9.2 pydeck-0.8.0b1 pydub-0.25.1 pygments-2.13.0 pympler-1.0.1 pynacl-1.5.0 pynvml-11.4.1 pyparsing-3.0.9 pyrsistent-0.18.1 python-dateutil-2.8.2 python-multipart-0.0.5 pytorch-lightning-1.4.2 pytz-2022.2.1 pytz-deprecation-shim-0.1.0.post0 pyyaml-6.0 realesrgan regex-2022.8.17 requests-2.25.1 requests-oauthlib-1.3.1 resize-right-0.0.2 rfc3986-1.5.0 rich-12.5.1 rsa-4.9 scikit-image-0.19.3 scipy-1.9.1 semver-2.13.0 
sentry-sdk-1.9.5 setproctitle-1.3.2 shortuuid-1.0.9 smmap-5.0.0 sniffio-1.2.0 starlette-0.19.1 streamlit-1.12.2 taming-transformers tb-nightly-2.11.0a20220827 tensorboard-2.10.0 tensorboard-data-server-0.6.1 tensorboard-plugin-wit-1.8.1 test-tube-0.7.5 tifffile-2022.8.12 tokenizers-0.12.1 toml-0.10.2 toolz-0.12.0 torch-fidelity-0.3.0 torchdiffeq-0.2.3 torchmetrics-0.6.0 tornado-6.2 tqdm-4.64.0 transformers-4.19.2 tzdata-2022.2 tzlocal-4.2 uc-micro-py-1.0.1 urwid-2.1.2 uvicorn-0.18.3 validators-0.20.0 wandb-0.13.2 watchdog-2.1.9 wcwidth-0.2.5 websockets-10.3 werkzeug-2.2.2 yapf-0.32.0 yarl-1.8.1 zipp-3.8.1

Show the output from this point, where it's failing to install the missing dependencies. Also, please provide the output of:

docker version
docker-compose -v

PS I would suggest reverting the changes, the files being in /src should not be an issue. conda clean --all just cleans up cache files to make the docker image smaller.

-a, --all
    Remove index cache, lock files, unused cache packages, and tarballs.

oc013 commented 2 years ago

I'll test with everything moved to /sd...

oc013 commented 2 years ago

I see the same error now after pulling all the changes, working on it.

sd  | Traceback (most recent call last):
sd  |   File "scripts/webui.py", line 39, in <module>
sd  |     import k_diffusion as K
sd  | ModuleNotFoundError: No module named 'k_diffusion'
sd  | Relauncher: Process is ending. Relaunching in 0.5s...
sd  | Relauncher: Launching...

hlky commented 2 years ago

Thanks @oc013

From what I saw the k-diffusion directory doesn't exist and there should be another one as well, they are both installed from the environment.yaml, or should be, so I think they are probably getting deleted for some reason

oc013 commented 2 years ago

@hlky I think I know what the problem is, I probably need to move the initial setup for the conda env into the entrypoint script as well. The volume that saves the conda env is possibly overwriting the initial setup. Give me a bit I'll work it out.

This would normally have worked, but because the conda env installs some packages into the /sd/src directory, when the volume for /sd later gets mounted I think it overwrites them, leaving an incomplete state.
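
One way to check for the incomplete state described here is to list which source checkouts are absent from the src directory at runtime. The sketch below is only illustrative: the directory names are assumptions inferred from the "Running setup.py develop for ..." lines in the build log, the /sd/src path assumes the layout after the move to /sd, and `missing_sources` is a hypothetical helper, not something in the repo.

```python
from pathlib import Path

# Assumed directory names, inferred from the "setup.py develop" lines in the
# build log; the real names under src/ may differ.
EXPECTED_SOURCES = ["gfpgan", "realesrgan", "taming-transformers", "k-diffusion", "clip"]


def missing_sources(src_dir, expected=EXPECTED_SOURCES):
    """Return the expected editable-install checkouts that are absent from src_dir."""
    root = Path(src_dir)
    return [name for name in expected if not (root / name).is_dir()]


if __name__ == "__main__":
    # /sd/src is the assumed location inside the container.
    print(missing_sources("/sd/src"))
```

With the src/src listing reported earlier in this thread (only gfpgan and realesrgan present), the remaining names would be reported as missing, which matches the failing `import k_diffusion`.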

hlky commented 2 years ago

Have you got it pulling changes when the container starts, or anything like that? A git pull of this repo will pull the stable version; you could probably add an option to switch to dev. Check .github/sync.yml on stable-diffusion-webui to see which files are synced and their destinations in this repo.

There was a change to webui.cmd that updates the environment from the environment.yaml file if there are any changes; if there are no changes, it doesn't do anything. It's safe to run both the conda env create command and the update command without affecting the environment: they only properly run if either the environment doesn't exist or the environment.yaml differs from the environment's current setup. Just wanted to check you knew about that so you could implement it if needed.

oc013 commented 2 years ago

Once I deleted all the caching I had built in, I realized the issue. I believe this issue should be resolved once the above is merged, my apologies.

@lukebelbina you can run the ./docker-reset.sh file I added, or run the individual commands inside it, to fully clear things up and reclaim some disk space. Then run docker-compose up and let me know how it goes.

@hlky I did see that, just was trying to reduce the start up time so I added a small bit of code to only run those commands if the conda env doesn't exist or if the date modified timestamp on the environment.yaml changes
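
The check described here (skip the conda commands unless the env is missing or environment.yaml has a new modification time) can be sketched as follows. This is not the actual entrypoint.sh logic, which is a shell script and is not shown in this thread; the function names and the idea of recording the timestamp in a separate stamp file are assumptions made for illustration.

```python
import os
from pathlib import Path


def env_needs_update(env_dir, env_yaml, stamp_file):
    """Return True when the conda env is missing or environment.yaml has a new mtime."""
    if not Path(env_dir).is_dir():
        return True  # no environment yet: it has to be created
    stamp = Path(stamp_file)
    if not stamp.is_file():
        return True  # no record of a previous successful run
    current = str(os.path.getmtime(env_yaml))
    return stamp.read_text().strip() != current


def record_env_update(env_yaml, stamp_file):
    """Store environment.yaml's mtime after a successful create or update."""
    Path(stamp_file).write_text(str(os.path.getmtime(env_yaml)))
```

The caller would run the conda create/update commands only when `env_needs_update(...)` returns True, then call `record_env_update(...)`. Comparing timestamps is cheaper than invoking conda on every start, at the cost of a needless update if the file is touched without being changed.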

One bonus is the docker image is now only 2.3GB instead of 8GB

hlky commented 2 years ago

59

lukebelbina commented 2 years ago

@oc013 and @hlky - thank you both so much for fixing this and your work, confirmed it is now working this morning!

mohammedalsayegh commented 2 years ago

After running docker-compose up I get the following ModuleNotFoundError, but with 'frontend' instead of 'k_diffusion':

Traceback (most recent call last):
sd  |   File "/sd/scripts/webui.py", line 3, in <module>
sd  |     from frontend.frontend import draw_gradio_ui
sd  | ModuleNotFoundError: No module named 'frontend'
sd  | entrypoint.sh: Process is ending. Relaunching in 0.5s...
sd  | entrypoint.sh: Launching...'
sd  | Relaunch count: 135

Running ./docker-reset.sh or recloning with a new image results in the same error.

oc013 commented 2 years ago

@mohammedalsayegh this is not a docker problem, the code is wrong as far as I can tell

hlky commented 2 years ago

it's an issue with the Dockerfile not pulling the changes from the repo properly

the ui was refactored and changes need to be pulled from the main repo

oc013 commented 2 years ago

The Dockerfile does not pull the code, it is up to the user to git pull updates from the repo and restart the container which will use the code via a mounted volume. I detailed everything here #93 for future reference should you need to update but it should be a minimal maintenance system other than possibly adding new model files that need to be pre-downloaded.

Working on another project now, but I recloned this repo and ran everything from fresh per the instructions and it works with no problem, my mistake it's the user.

I see now he did not update the code from the repo because it still has that stray single quote in the one message in the console. It's not relevant to this issue he commented on.

mohammedalsayegh commented 2 years ago

@mohammedalsayegh this is not a docker problem, the code is wrong as far as I can tell

Both webui.cmd and webuildm.cmd worked with no issue under a conda-built environment, so I thought something was likely missing in the Dockerfile.

mohammedalsayegh commented 2 years ago

It works from the /home/user_name/ directory in WSL, but I get the above path error when the repository clone is placed in /mnt/c/..etc

oc013 commented 2 years ago

@mohammedalsayegh I'm not sure what /mnt/c/..etc is, maybe it's a windows thing

Run this and send the output

docker exec -it sd bash
pwd

mohammedalsayegh commented 2 years ago

@mohammedalsayegh I'm not sure what /mnt/c/..etc is, maybe it's a windows thing

By /mnt/c/..etc I was referring to any subdirectory of C under WSL. That turned out not to be the cause: as soon as Docker and WSL were rebooted, it worked.

In spite of deleting the docker volume and image, changing the directory, and cloning again, I have not been able to reproduce the same error. There may have been a problem with WSL accessing Windows directory permissions at runtime.

There was a second error during docker compose up, but it was resolved after multiple attempts of sudo docker compose up.

[+] Building 4.8s (3/3) FINISHED
 => [internal] load build definition from Dockerfile                                                                                                   0.1s
 => => transferring dockerfile: 731B                                                                                                                   0.0s
 => [internal] load .dockerignore                                                                                                                      0.1s
 => => transferring context: 2B                                                                                                                        0.0s
 => ERROR [internal] load metadata for docker.io/nvidia/cuda:11.3.1-runtime-ubuntu20.04                                                                4.7s
------
 > [internal] load metadata for docker.io/nvidia/cuda:11.3.1-runtime-ubuntu20.04:
------
failed to solve: rpc error: code = Unknown desc = failed to solve with frontend dockerfile.v0: failed to create LLB definition: failed to do request: Head "https://registry-1.docker.io/v2/nvidia/cuda/manifests/11.3.1-runtime-ubuntu20.04": EOF

The sd container was allocated correctly on the second run:

(base) PS C:\Users\red> docker exec -it sd bash
(ldm) root@25ee48096a4e:/sd# pwd
/sd
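
Since the registry EOF error above cleared up after repeated attempts, a small retry wrapper can save re-running the command by hand. This is a generic sketch under the assumption that the failure is transient; `run_with_retries` and its defaults are made up for illustration and are not part of the repo.

```python
import subprocess
import time


def run_with_retries(cmd, attempts=3, delay=5.0):
    """Run cmd until it exits 0 or attempts are exhausted; return the last exit code."""
    code = 1
    for attempt in range(1, attempts + 1):
        code = subprocess.run(cmd).returncode
        if code == 0:
            break
        if attempt < attempts:
            time.sleep(delay)  # give a transient registry/network error time to clear
    return code
```

For example, `run_with_retries(["docker", "compose", "build"])` would retry the image build up to three times. A persistent failure still surfaces as a non-zero return value, so it does not hide real build errors.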

Sygil-Dev / stable-diffusion

docker-compose up error: No module named 'k_diffusion' #45
