Closed psychedelicious closed 2 years ago
i3oc9i also found that the github workflow is stuck with the same error: https://github.com/invoke-ai/InvokeAI/actions/runs/3094254306/jobs/5007435455
Ok, so the scripts directory is not being added to sys.path when running python scripts/dream.py. It also affects running other scripts, e.g. python backend/server.py.
Manually adding the scripts directory (e.g. sys.path.append("/Users/spencer/Documents/Code/stable-diffusion/")) or "" (sys.path.append("")) fixes it.
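For anyone who wants to try the same workaround without hardcoding a personal path, here is a minimal sketch. It is not code from the repository: the helper name ensure_on_sys_path is hypothetical, and the usage comment assumes the script sits one directory below the project root, as scripts/dream.py does.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(directory):
    """Put `directory` at the front of sys.path unless it is already listed.

    Hypothetical helper for illustration; not part of the repository.
    Returns the resolved directory as a string.
    """
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Assumed usage inside a script such as scripts/dream.py, where the project
# root is the parent of the directory that contains the script:
#     ensure_on_sys_path(Path(__file__).resolve().parent.parent)
# The call has to run before the first `import ldm...` line.
```

Because the root is derived from the script's own location, this does not depend on the directory the command is launched from, which a bare sys.path.append('.') does.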
I have tried to figure out what changed to cause this issue but am at a loss.
Resident pythonista @tildebyte ?
It's not happening on my linux system, but this issue has come up before in a Windows-specific fashion.
Two things to try:
1. Run pip install -e . (don't forget the dot at the end).
2. Add sys.path.append('.') on the line right after import sys at the very top.

I have no freakin' idea why that commit got so messed up. The original PR was just a typographical fix to a single file, but instead ended up pulling in changes from @psychedelicious's cleanup of the web stuff. I am learning to not click on the PR button that says "update branch" when I see the message "This branch is out-of-date with the base branch." I think the right thing to do is to rebase - would like @tildebyte's advice too.
~pip install -e . did fix my environment, thanks!~
Edit: It worked for one run of the script, but then it stopped working. I had to move the sys.path.append('.') up before any local imports.
Unfortunately the issue is reproducible and happens on a fresh conda install after that problematic commit. What could have changed to cause this issue?
Moving the sys.path.append('.') would also work, but given that it hadn't changed when the problematic commit occurred, I feel like moving it is bypassing an issue somewhere else.
I think the key clue here is that pip install -e . worked once, and then it failed on subsequent tries. Something that happens during that first run of the dream script is stably modifying the local environment. I can't imagine what it could be.
One way to help reduce the problem search space is to first confirm that pip install -e . fixes the problem once. Then try launching the backend/server.py script and see if the problem occurs there as well. If it does, we can assume that whatever the problem is, it is occurring in the common modules shared by the dream script and the web backend.
It also might be helpful to insert print(sys.path) at various strategic locations in the code. In my environment, before running dream.py or anything like that, it looks like this:
['', '/u/lstein/projects/SD/stable-diffusion', '/usr/share/pyshared', '/u/lstein/.conda/envs/ldm/lib/python39.zip',
'/u/lstein/.conda/envs/ldm/lib/python3.9', '/u/lstein/.conda/envs/ldm/lib/python3.9/lib-dynload',
'/u/lstein/.conda/envs/ldm/lib/python3.9/site-packages', '/u/lstein/projects/SD/stable-diffusion/src/gfpgan',
'/u/lstein/projects/SD/stable-diffusion/src/clip', '/u/lstein/projects/SD/stable-diffusion/src/taming-transformers',
'/u/lstein/projects/SD/stable-diffusion/src/k-diffusion']
I bet at some point during the first run of dream.py, the entry that points at the stable-diffusion directory will disappear.
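A slightly more targeted check than a bare print(sys.path) might look like the sketch below. It is only an illustration: the function name import_state and the dictionary keys are made up for this example, and it should be given a top-level package name such as ldm (for dotted names, importlib.util.find_spec imports the parent package first and may raise).

```python
import importlib.util
import sys
from pathlib import Path


def import_state(module_name, project_root):
    """Report whether project_root is on sys.path and whether module_name resolves.

    Hypothetical diagnostic helper; pass a top-level module name.
    """
    root = Path(project_root).resolve()
    # An empty sys.path entry means "current working directory".
    on_path = any(Path(entry or ".").resolve() == root for entry in sys.path)
    spec = importlib.util.find_spec(module_name)
    return {
        "root_on_sys_path": on_path,
        "module_found": spec is not None,
        "module_origin": getattr(spec, "origin", None),
    }


# Assumed usage, dropped in just before the failing import:
#     print(import_state("ldm", "/path/to/stable-diffusion"))
```

Calling it once at the top of the script and once right before the failing import would show whether the project root disappears in between or was never there.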
Hmm. The scripts worked once, then I got the same error. After that, pip install -e . didn't work. I wiped out all my envs and started fresh from development, and pip install -e . didn't fix the error. Maybe wiping out my local copy of the repo and starting totally fresh will affect it.
Anyways, I'll revisit the issue tomorrow. For now I've just moved the sys.path.append(".") above all of the ldm imports.
I've done a bit of checking with git bisect; it seems 7b0cbb34d618098b4072f14870937ee9eb4369a1 causes this issue.
EDIT: Tried narrowing it down more, but my Python knowledge is roughly 0, so I'll leave that to someone who speaks Python ;)
I'm a bit hamstrung here because everything's hunky-dory on my Linux system. If worst comes to worst, I'll back out all the changes to https://github.com/invoke-ai/InvokeAI/commit/7b0cbb34d618098b4072f14870937ee9eb4369a1 and reconstruct. It's a pity, because there were a lot of new features there, including improvements to the WebUI and outpainting.
Have any of the Mac users experienced this regression?
yep, I'm on a mac (m1 mba); I'm having another look at it later today
This is what comes of reading bug reports late at night. I totally lost track of the fact that this was reported on a Mac system. I got fixated on Windows in some way. Apologies.
One big difference between my environment and the Mac environment is that I'm using Python 3.9 and the Mac environment is 3.10. I will try 3.10 and see if I can reproduce.
Can any Windows users confirm that this bug appears on their systems?
Wait, ldm.gfpgan.gfpgan_tools was removed, should that just be ldm.restoration.gfpgan.gfpgan or so?
EDIT: ok, nvm, bit out of my league here wrt python.
Both from ldm.gfpgan.gfpgan_tools import real_esrgan_upscale and from ldm.gfpgan.gfpgan_tools import run_gfpgan no longer work because gfpgan_tools.py is gone, but they are still used in server.py. Perhaps some changes got lost?
I believe that gfpgan_tools was refactored and is no longer needed, but I'm checking to make sure that this is the case.
That's an uncaught bug in server.py, and I'll fix.
UPDATE: Which server.py? Is it backend/server.py or ldm/server.py?
sorry! it's in backend/server.py. Usages:
https://github.com/invoke-ai/InvokeAI/blob/19174949b6eafe57d576633d4e2c6979e8cc03a9/backend/server.py#L653 https://github.com/invoke-ai/InvokeAI/blob/19174949b6eafe57d576633d4e2c6979e8cc03a9/backend/server.py#L207 https://github.com/invoke-ai/InvokeAI/blob/19174949b6eafe57d576633d4e2c6979e8cc03a9/backend/server.py#L635
And the imports: https://github.com/invoke-ai/InvokeAI/blob/19174949b6eafe57d576633d4e2c6979e8cc03a9/backend/server.py#L21
@holstvoogd Yes, that's expected; backend/server.py is being updated now to use the new restoration module.
Ah, yeah, I see now. nevermind all my comments then. I hadn't noticed this was actually not about backend/server.py 🤦♂️
Oh, I was just testing my fixes to backend/server.py. I will wait for @psychedelicious to commit his PR and work on dream/server.py instead. I do have it working if you want it. The only problem is that I had to hardcode constants for the locations of the GFPGAN directory, etc, because I wasn't sure where they come from in @psychedelicious 's code.
Here's the diff in case it is useful. The first bit is just changes needed to connect on my firewalled system.
diff --git a/backend/server.py b/backend/server.py
index 11d6c61..9302859 100644
--- a/backend/server.py
+++ b/backend/server.py
@@ -18,9 +18,8 @@ from threading import Event
from uuid import uuid4
from send2trash import send2trash
-from ldm.gfpgan.gfpgan_tools import real_esrgan_upscale
-from ldm.gfpgan.gfpgan_tools import run_gfpgan
from ldm.generate import Generate
+from ldm.dream.restoration import Restoration
from ldm.dream.pngwriter import PngWriter, retrieve_metadata
from ldm.dream.args import APP_ID, APP_VERSION, calculate_init_img_hash
from ldm.dream.conditioning import split_weighted_subprompts
@@ -34,11 +33,12 @@ USER CONFIG
output_dir = "outputs/" # Base output directory for images
# host = 'localhost' # Web & socket.io host
-host = "localhost" # Web & socket.io host
+host = "0.0.0.0" # Web & socket.io host
port = 9090 # Web & socket.io port
verbose = False # enables copious socket.io logging
additional_allowed_origins = [
- "http://localhost:5173"
+ "http://localhost:5173",
+ "http://localhost:9090",
] # additional CORS allowed origins
model = "stable-diffusion-1.4"
@@ -46,12 +46,15 @@ model = "stable-diffusion-1.4"
END USER CONFIG
"""
+# Face Restoration constants that need to be replaced by user configuration
+GFPGAN_DIR = './src/gfpgan'
+GFPGAN_MODEL_PATH = 'experiments/pretrained_models/GFPGANv1.3.pth'
+ESRGAN_BG_TILE = 400
"""
SERVER SETUP
"""
-
# fix missing mimetypes on windows due to registry wonkiness
mimetypes.add_type("application/javascript", ".js")
mimetypes.add_type("text/css", ".css")
@@ -204,13 +207,15 @@ def handle_run_esrgan_event(original_image, esrgan_parameters):
socketio.emit("progressUpdate", progress)
eventlet.sleep(0)
- image = real_esrgan_upscale(
+ # this could be done at initialization time
+ restoration = Restoration(GFPGAN_DIR,GFPGAN_MODEL_PATH,ESRGAN_BG_TILE)
+ esrgan = restoration.load_ersgan()
+ image = esrgan.process(
image=image,
upsampler_scale=esrgan_parameters["upscale"][0],
strength=esrgan_parameters["upscale"][1],
seed=seed,
)
-
progress["currentStatus"] = "Saving image"
socketio.emit("progressUpdate", progress)
eventlet.sleep(0)
@@ -275,7 +280,10 @@ def handle_run_gfpgan_event(original_image, gfpgan_parameters):
socketio.emit("progressUpdate", progress)
eventlet.sleep(0)
- image = run_gfpgan(
+ # this could be done at initialization time
+ restoration = Restoration(GFPGAN_DIR,GFPGAN_MODEL_PATH,ESRGAN_BG_TILE)
+ gfpgan = restoration.load_gfpgan()
+ image = gfpgan.process(
image=image,
strength=gfpgan_parameters["gfpgan_strength"],
seed=seed,
The legacy server application doesn't have this problem because it relies on generate() to run the upscaling tools.
I've found a solution for the issue with scripts/dream.py:
@@ -47,7 +47,7 @@ def main():
# Loading Face Restoration and ESRGAN Modules
try:
gfpgan, codeformer, esrgan = None, None, None
- from ldm.dream.restoration import Restoration
+ from ldm.dream.restoration.base import Restoration
restoration = Restoration(opt.gfpgan_dir, opt.gfpgan_model_path, opt.esrgan_bg_tile)
if opt.restore:
gfpgan, codeformer = restoration.load_face_restore_models()
Restoration was moved from ldm/restoration/restoration.py to ldm/dream/restoration/base.py, that seems to cause this.
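While a module is in the middle of being moved like this, a script can be made tolerant of both layouts. The following is only a sketch under that assumption, not what the repository does: import_first_available is a hypothetical helper, and the two module paths in the usage comment are simply the old and new locations named above.

```python
import importlib


def import_first_available(module_names):
    """Return the first module in module_names that imports successfully.

    Hypothetical helper; raises ModuleNotFoundError if none can be imported.
    """
    errors = []
    for name in module_names:
        try:
            return importlib.import_module(name)
        except ModuleNotFoundError as exc:
            errors.append(f"{name}: {exc}")
    raise ModuleNotFoundError("none importable: " + "; ".join(errors))


# Assumed usage, trying the new location first and the old one second:
#     restoration_module = import_first_available(
#         ["ldm.dream.restoration.base", "ldm.restoration.restoration"]
#     )
#     Restoration = restoration_module.Restoration
```

The collected error messages make it easier to tell a moved module apart from the separate problem of the project root missing from sys.path, since in the latter case every candidate fails with "No module named 'ldm'".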
Ok, sorry for the confusion earlier! I was a bit too eager to help & doing other work at the same time.
Now, I've taken time to look closely at what I am actually doing & I can confirm that:
- python scripts/dream.py is still broken for new environments on the latest development commit

I'm not sure, but this feels like a conda bug tbh. Anyway, removing pyproject.toml fixes the issues with ModuleNotFoundError: No module named 'ldm'
Great detective work! Thanks for tracking down the path issue to pyproject.toml. I share your puzzlement. Are you 100% sure that removing this file and running conda env update fixes the problem completely? Perhaps @tildebyte can shed some light on this. Perhaps there is an interaction between conda and this file that I'm not aware of.
I'm happy to work around the problem for now just by adding the sys.path.append('.') line to the top of dream.py. backend/server.py already does this. In the long run I want to understand why the module loading path is getting screwed up. Did you ever try printing the contents of sys.path using print(sys.path)? The first or second entry should be the absolute pathname of the InvokeAI (or stable-diffusion) directory. If it's not, then some interaction with pyproject.toml must be occurring that alters it.
PR #732, which was just committed to development, should fix python scripts/dream.py. Please report if it doesn't. There is a bug tracking issue #619 specifically set up for reporting WebGUI bugs.
Yes, I've tried several times to be sure :) It seems pyproject.toml is meant to replace setup.py. So conda ignores(?) setup.py when it sees the pyproject.toml, & since the pyproject.toml has no build config etc., it breaks. I've taken a quick look at migrating setup.py and it is supposed to be super easy, but I couldn't figure it out tbh.
As for print(sys.path) just before the erroring line:
['/Users/arthur/Projects/SD/stable-diffusion/scripts', '/opt/homebrew/Caskroom/miniconda/base/envs/ldm/lib/python310.zip', '/opt/homebrew/Caskroom/miniconda/base/envs/ldm/lib/python3.10', '/opt/homebrew/Caskroom/miniconda/base/envs/ldm/lib/python3.10/lib-dynload', '/opt/homebrew/Caskroom/miniconda/base/envs/ldm/lib/python3.10/site-packages', '/Users/arthur/Projects/SD/stable-diffusion/src/taming-transformers', '/Users/arthur/Projects/SD/stable-diffusion/src/clip', '/Users/arthur/Projects/SD/stable-diffusion/src/gfpgan']
And with the toml removed:
['/Users/arthur/Projects/SD/stable-diffusion/scripts', '/opt/homebrew/Caskroom/miniconda/base/envs/ldm/lib/python310.zip', '/opt/homebrew/Caskroom/miniconda/base/envs/ldm/lib/python3.10', '/opt/homebrew/Caskroom/miniconda/base/envs/ldm/lib/python3.10/lib-dynload', '/opt/homebrew/Caskroom/miniconda/base/envs/ldm/lib/python3.10/site-packages', '/Users/arthur/Projects/SD/stable-diffusion/src/taming-transformers', '/Users/arthur/Projects/SD/stable-diffusion', '/Users/arthur/Projects/SD/stable-diffusion/src/clip', '/Users/arthur/Projects/SD/stable-diffusion/src/gfpgan']
Since that is not very readable: In the former case the stable-diffusion directory was not included, in the latter it is. Otherwise the paths are the same
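The comparison of the two listings above can also be done mechanically. This is a small sketch with a hypothetical helper name (path_difference); the lists are abbreviated copies of the two outputs, with the interpreter and site-packages entries that appear in both left out.

```python
def path_difference(before, after):
    """Return (entries only in `before`, entries only in `after`), keeping order."""
    only_before = [p for p in before if p not in after]
    only_after = [p for p in after if p not in before]
    return only_before, only_after


# Abbreviated versions of the two sys.path listings from the thread.
root = "/Users/arthur/Projects/SD/stable-diffusion"
with_toml = [
    f"{root}/scripts",
    f"{root}/src/taming-transformers",
    f"{root}/src/clip",
    f"{root}/src/gfpgan",
]
without_toml = [
    f"{root}/scripts",
    f"{root}/src/taming-transformers",
    root,
    f"{root}/src/clip",
    f"{root}/src/gfpgan",
]

missing, added = path_difference(with_toml, without_toml)
print(missing)  # []
print(added)    # ['/Users/arthur/Projects/SD/stable-diffusion']
```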
backend/server.py works again with that PR merged! 👍 thanks for all the hard work on this!
Got distracted, sorry: TL;DR - @lstein please revert the pyproject.toml commit
conda is doing something "convenient" (HUGE "airquotes") again, i.e. processing the pyproject.toml and taking action based on its mere existence, even though there are no directives in there for it.
FWIW I'm still working on dropping conda, but I'm still seeing unreproducible-on-my-end reports of pip/pew not working properly even on Windows...
FTR, I've built and rebuilt all of my stable and dev venvs locally on Windows 11 with Python3.10 using pip/pew since I can't even remember how long ago... without major incident. Occasionally I hit a dumb typo or something, but it. just. works. here. This isn't to point fingers at users and say "your fault", but rather to express my frustration at not being able to repro install issues locally...
OK. I'm going to rename pyproject.toml to pyproject.toml.hide. Hidden behaviors are very frustrating. I wonder if this has something to do with the renaming of the repository?
@lstein: "I wonder if this has something to do with the renaming of the repository?"
Almost definitely not. It's some weird behavior of conda trying to use the 'pyproject.toml' (which can be used similarly to 'requirements.txt' or 'environment.yaml') during the install...
Goodness. That was a saga. Poor little Toml, didn’t know the chaos he sowed. Thanks for the in depth troubleshooting @holstvoogd !
I'm having the same error message. How can I fix it?
Still the same issue for fresh installs. The basic documentation installation step-by-step is broken. This is serious.
Describe your environment
Describe the bug
On a fresh environment, running python scripts/dream.py fails:

Reported by i3oc9i on discord, who found that the issue appears at this commit and did much of the troubleshooting: e33ed45cfc030a8a454fcc49d0b6ebf991a7e079 - Delete redundant backquotes #665
I had an environment from before that commit. With an updated development branch (commit 810112577fb0ec352a9f51b6e796cbdcef9fcac4), I ran mamba env update -f environment-mac.yaml and now my environment is broken.

Making a new env after checking out a commit prior to the problematic one, the issue no longer occurs.
[...] - protobuf==3.20.1 to environment-mac.yaml. Removing that line and making a fresh environment - leaving everything else the same - does not fix the issue.
[...] dream.py to the project root, it works.
[...] import ldm.dream.readline, you get the same error referring to the next line.
[...] import ldm.dream.readline just fine.

So it seems that the current working directory is somehow not passed to dream.py.

I haven't the slightest clue what could cause this.