Closed: ChrisAcrobat closed this 2 years ago
📢 Discussion from #59 continues here.
@hlky @oc013 Looks like it's crashing here. I don't see any logs from webui.py; are they hidden somewhere?
sd | entrypoint.sh: Launching...'
sd | Relaunch count: 9
sd | Loaded GFPGAN
sd | Loaded RealESRGAN with model RealESRGAN_x4plus
sd | Loading model from models/ldm/stable-diffusion-v1/model.ckpt
sd | Global Step: 470000
sd | LatentDiffusion: Running in eps-prediction mode
sd | entrypoint.sh: Process is ending. Relaunching in 0.5s...
sd | /sd/entrypoint.sh: line 89: 774 Killed python -u scripts/webui.py
sd | entrypoint.sh: Launching...'
sd | Relaunch count: 10
sd | Loaded GFPGAN
sd | Loaded RealESRGAN with model RealESRGAN_x4plus
sd | Loading model from models/ldm/stable-diffusion-v1/model.ckpt
sd | Global Step: 470000
sd | LatentDiffusion: Running in eps-prediction mode
sd | /sd/entrypoint.sh: line 89: 798 Killed python -u scripts/webui.py
sd | entrypoint.sh: Process is ending. Relaunching in 0.5s...
sd | entrypoint.sh: Launching...'
sd | Relaunch count: 11
sd | Loaded GFPGAN
sd | Loaded RealESRGAN with model RealESRGAN_x4plus
sd | Loading model from models/ldm/stable-diffusion-v1/model.ckpt
sd | Global Step: 470000
sd | LatentDiffusion: Running in eps-prediction mode
sd | entrypoint.sh: Process is ending. Relaunching in 0.5s...
sd | /sd/entrypoint.sh: line 89: 822 Killed python -u scripts/webui.py
sd | entrypoint.sh: Launching...'
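The pattern in the log (launch, die with `Killed`, relaunch after 0.5s) suggests a supervise loop in entrypoint.sh. A minimal sketch of that loop, purely hypothetical since the real script isn't reproduced here: the loop is capped at a maximum so the sketch terminates, and `false` stands in for `python -u scripts/webui.py`.

```shell
#!/bin/sh
# Hypothetical sketch of a supervise-and-relaunch loop like the one in the
# log above. The real entrypoint.sh presumably loops forever; this sketch is
# capped so it terminates.
relaunch_loop() {
    cmd=$1    # the supervised command ("false" stands in for the webui)
    max=$2    # cap on relaunches, so the sketch ends
    count=0
    while [ "$count" -lt "$max" ]; do
        echo "entrypoint.sh: Launching..."
        $cmd    # exits when the process dies (e.g. OOM-killed)
        echo "entrypoint.sh: Process is ending. Relaunching in 0.5s..."
        sleep 0.5
        count=$((count + 1))
        echo "Relaunch count: $count"
    done
}

relaunch_loop false 2
```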
As far as I know, what you see in STDOUT there is the logs. I'm rebuilding now to give you a comparison of what should happen on a successful first launch. Can you scroll back and see if there were any errors downloading things?
I'm assuming everything up to the actual launch point was successful since I see in your output that it got past loading the model files without an error:
sd | entrypoint.sh: Launching...
sd | Downloading: "https://github.com/xinntao/facexlib/releases/download/v0.1.0/detection_Resnet50_Final.pth" to /opt/conda/envs/ldm/lib/python3.8/site-packages/facexlib/weights/detection_Resnet50_Final.pth
sd |
100%|██████████| 104M/104M [00:05<00:00, 19.4MB/s]
sd | Downloading: "https://github.com/xinntao/facexlib/releases/download/v0.2.2/parsing_parsenet.pth" to /opt/conda/envs/ldm/lib/python3.8/site-packages/facexlib/weights/parsing_parsenet.pth
sd |
100%|██████████| 81.4M/81.4M [00:04<00:00, 19.5MB/s]
sd | Loaded GFPGAN
sd | Loaded RealESRGAN with model RealESRGAN_x4plus
sd | Loading model from models/ldm/stable-diffusion-v1/model.ckpt
sd | Global Step: 470000
sd | LatentDiffusion: Running in eps-prediction mode
sd | DiffusionWrapper has 859.52 M params.
sd | making attention of type 'vanilla' with 512 in_channels
sd | Working with z of shape (1, 4, 32, 32) = 4096 dimensions.
sd | making attention of type 'vanilla' with 512 in_channels
Downloading: 100%|██████████| 1.59G/1.59G [01:30<00:00, 18.9MB/s]
sd | Running on local URL: http://localhost:7860/
sd |
sd | To create a public link, set `share=True` in `launch()`.
PS That extra single quote I left in the launching msg really bugs me :laughing:
Is this ready, @ChrisAcrobat @oc013?
Haven't checked the last comments yet (timezone). I'll do it on the train or something. I'll reply later!
@oc013 I'm assuming everything up to the actual launch point was successful since I see in your output that it got past loading the model files without an error:
I think it looks so too. 🙂 log
This PR changes scripts/webui.py @hlky
$ git stash
warning: CRLF will be replaced by LF in scripts/webui.py.
Yeah, that wasn't meant to be merged. Did I accidentally disable the draft status?
@hlky: Remove this line: https://github.com/hlky/stable-diffusion/blob/main/.gitattributes#L2
I think either remove `.gitattributes`, and users of the repository should be expected to have their git client set up as I linked in the previous discussion (`git config --global core.autocrlf input`), so they get exactly what is in the repository. Or the files in the repository should be updated to be consistent, allowing `.gitattributes` to do its thing and result in a clean state on everyone's local machine.
Doing `git config --global core.autocrlf input` will affect every (future?) repo for that user, which can cause other problems. But sure, adding `.gitattributes` to `.gitignore` could be a possible fix. Still, the `.sh` files (as in my intended PR 5007fdc96f7b6bdb33a56b0a03f0765127e8e585) should never ever use CRLF, right?
Someone made a PR that will make everything consistent #91
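For reference, a consistent setup could look something like the following `.gitattributes` sketch (hypothetical, not necessarily what #91 does): let git normalize text files, and force LF on shell scripts so entrypoint.sh can never end up with CRLF line endings.

```
# Hypothetical .gitattributes sketch (not the repo's actual file):
# normalize all text files on commit, and force LF for shell scripts.
* text=auto
*.sh text eol=lf
```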
Regarding your problem, I'm not sure what's going wrong yet, but I did start a general discussion on Docker here: #93. It includes some Windows-specific info I found. Can you verify that you've completed everything there?
Yes, I have just redone all the steps in #93 from scratch, with no noticeable difference. After the reinstall I also tried closing the container and then restarting it, and then I noticed where it crashed. It was maybe obvious to you, but I now see that it crashes somewhere here:
sd | Loading model from models/ldm/stable-diffusion-v1/model.ckpt
sd | Global Step: 470000
sd | LatentDiffusion: Running in eps-prediction mode
Full log after the second `docker compose up`, after the first attempt returned the same result as before (similar or equal to this):
[+] Running 1/0
- Container sd Created 0.0s
Attaching to sd
sd | active environment : ldm
sd | active env location : /opt/conda/envs/ldm
sd | Validating model files...
sd | checking model.ckpt...
sd | model.ckpt is valid!
sd |
sd | checking GFPGANv1.3.pth...
sd | GFPGANv1.3.pth is valid!
sd |
sd | checking RealESRGAN_x4plus.pth...
sd | RealESRGAN_x4plus.pth is valid!
sd |
sd | checking RealESRGAN_x4plus_anime_6B.pth...
sd | RealESRGAN_x4plus_anime_6B.pth is valid!
sd |
sd | entrypoint.sh: Launching...'
sd | Loaded GFPGAN
sd | Loaded RealESRGAN with model RealESRGAN_x4plus
sd | Loading model from models/ldm/stable-diffusion-v1/model.ckpt
sd | Global Step: 470000
sd | LatentDiffusion: Running in eps-prediction mode
sd | entrypoint.sh: Process is ending. Relaunching in 0.5s...
sd | /sd/entrypoint.sh: line 89: 29 Killed python -u scripts/webui.py
sd | entrypoint.sh: Launching...'
sd | Relaunch count: 1
Can you try the following to see if you get any more info:
docker exec -it sd bash
python -u scripts/webui.py
(ldm) root@519ae7e8a662:/sd# python -u scripts/webui.py
Loaded GFPGAN
Loaded RealESRGAN with model RealESRGAN_x4plus
Loading model from models/ldm/stable-diffusion-v1/model.ckpt
Global Step: 470000
LatentDiffusion: Running in eps-prediction mode
Killed
Ok, just making sure it wasn't somehow the bash script killing the python script.
The next message should be DiffusionWrapper has 859.52 M params.
I'm not 100% sure at this point, but possibly memory is the issue? What does your hardware look like? In the `docker exec -it sd bash` session, can you run `nvidia-smi` and see your GPUs there?
https://stackoverflow.com/questions/65935028/python-script-gets-killed https://stackoverflow.com/questions/19189522/what-does-killed-mean-when-processing-a-huge-csv-with-python-which-suddenly-s
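As those links explain, a bare `Killed` with no Python traceback usually means the kernel's OOM killer ended the process, and the kernel log records it. A small sketch of the check, using only standard Linux tools (nothing from this repo); in practice you would pipe `dmesg` into the filter on the Docker host:

```shell
# Filter for OOM-killer lines in kernel log output.
# Real usage would be: dmesg | oom_lines
oom_lines() {
    grep -iE "killed process|out of memory"
}

# Demonstrated on a sample dmesg-style line (illustration only):
echo "Out of memory: Killed process 774 (python) total-vm:10485760kB" | oom_lines
```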
`nvidia-smi` returns:
Mon Aug 29 13:24:20 2022
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 515.57 Driver Version: 516.59 CUDA Version: 11.7 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 NVIDIA GeForce ... On | 00000000:01:00.0 On | N/A |
| N/A 59C P0 31W / N/A | 2200MiB / 8192MiB | 1% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: |
| GPU GI CI PID Type Process name GPU Memory |
| ID ID Usage |
|=============================================================================|
| 0 N/A N/A 84 C /python3.8 N/A |
+-----------------------------------------------------------------------------+
@oc013 It now also displayed python3.8, which it didn't do before. I was probably too quick the last time. So I also tried `python -u scripts/webui.py` again and now got this:
Traceback (most recent call last):
File "/sd/scripts/webui.py", line 3, in <module>
from frontend.frontend import draw_gradio_ui
ModuleNotFoundError: No module named 'frontend'
Added #93 to the wiki https://github.com/hlky/stable-diffusion/wiki/Docker-Guide
Updating shouldn't be the issue, because I made a clean install after first purging Docker. But I will certainly try it again!
This is new front-end code they changed; maybe it was cloned at a bad time. You can verify whether it's there at frontend/frontend.py
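A quick way to check that, assuming the repo root is /sd inside the container (as the logs above suggest); the helper name `check_frontend` is made up for this sketch:

```shell
# Hypothetical helper: report whether the new front-end code is present
# under a given repo root ($1).
check_frontend() {
    if [ -f "$1/frontend/frontend.py" ]; then
        echo "frontend/frontend.py present"
    else
        echo "frontend/frontend.py missing"
    fi
}

# Inside the container (docker exec -it sd bash), the repo root is /sd:
check_frontend /sd
```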
There was a frontend/frontend.py in the container, but I didn't look that closely. I have purged it now and am reinstalling.
I tried again now, but no luck. Still the same. It crashes somewhere between the log lines sd | LatentDiffusion: Running in eps-prediction mode and sd | DiffusionWrapper has 859.52 M params.
This maybe doesn't mean much, but cmdr2/stable-diffusion-ui is working fine for me through Docker.
I'm opening an issue; it doesn't make sense to have the communication in a closed PR. 🙂
By the way, thank you very much to both of you, @oc013 and @hlky for your help so far!
DRAFT: Confirming the solution.