Closed by bakkot 2 years ago
Update: it works with the web version as well. However, the following part
is gated by
if (!config.gfpgan_model_exists) {
document.querySelector("#gfpgan").style.display = 'none';
}
from static/dream_web/index.js
On my computer this evaluates to false (and doesn't hide it). However, if someone doesn't have the model and config.gfpgan_model_exists
evaluates to false, it will hide the option to upscale.
Since it's more related to faces and not to simply upscaling, maybe it should be modified as well (?)
I absolutely plan on getting this to work because I want it! :D I just haven't had time to work on it.
Thank you, works like a charm!
I do have a question though... for me it doesn't matter whether I pass -U 4 or -U 2, it always performs a 4x upscale. Is it working for you?
Also, in case somebody has issues with Mac security settings and the realesrgan file, I want to recommend the following article to resolve the issue: https://macpaw.com/how-to/allow-apps-anywhere
BR
@thg-muc Thanks for catching that! I forgot about passing the option to realesrgan-ncnn-vulkan. These are the changes to get it to work.
- def esrgan_resize(self, input):
+ def esrgan_resize(self, input, scale):

- image = self.esrgan_resize(image)
+ image = self.esrgan_resize(image, str(upscale[0]))

- subprocess.run(
-     ['./realesrgan-ncnn-vulkan', '-i', 'esrgan_in.png', '-o', 'esrgan_out.png'],
-     stdout=subprocess.PIPE
- ).stdout.decode('utf-8')
+ subprocess.run(
+     ['./realesrgan-ncnn-vulkan', '-i', 'esrgan_in.png', '-o', 'esrgan_out.png', '-s', scale],
+     stdout=subprocess.PIPE
+ ).stdout.decode('utf-8')
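Put together, the scale-aware call might be wrapped like this (a sketch, not the actual patched file; `esrgan_cmd` and `esrgan_resize_file` are illustrative helpers, and the binary is assumed to sit in the working directory):

```python
import os
import subprocess

def esrgan_cmd(infile, outfile, scale):
    """Build the realesrgan-ncnn-vulkan command line; -s is the upscale factor."""
    return ['./realesrgan-ncnn-vulkan', '-i', infile, '-o', outfile, '-s', str(scale)]

def esrgan_resize_file(infile, outfile, scale):
    # Only shell out when the executable is actually present next to us.
    if not os.path.exists('./realesrgan-ncnn-vulkan'):
        raise FileNotFoundError('realesrgan-ncnn-vulkan not found in the working directory')
    subprocess.run(esrgan_cmd(infile, outfile, scale), stdout=subprocess.PIPE)
```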
For simplicity, here's the updated file simplet2i.py.zip
If subprocess.run() works similarly on Windows and Linux, then this could be a whole framework for launching post-processing jobs, including ones that aren't native python. This would have the advantage of being both extensible and insulating us from changes in the GFPGAN package.
I see that there are ESRGAN ncnn-vulkan releases for Mac, Ubuntu and Windows. Is there something similar for GFPGAN, or do users have to go through installation from source?
Another question. The subprocess call is using a relative path to the executable. Does run() search the PATH, and is this portable between windows and mac/linux?
I assume subprocess.run() will work very similarly, but I don't have a Windows or Linux machine to test it.
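For what it's worth: with a ./-prefixed path, subprocess.run() does not search PATH on POSIX. One portable option (a sketch, untested on Windows) is to resolve the executable with shutil.which() first, which searches PATH on both platforms:

```python
import shutil
import subprocess

def find_realesrgan():
    """Resolve the upscaler binary: PATH first, then the current directory."""
    exe = shutil.which('realesrgan-ncnn-vulkan')                # searches PATH
    if exe is None:
        exe = shutil.which('realesrgan-ncnn-vulkan', path='.')  # fall back to cwd
    return exe

exe = find_realesrgan()
if exe is not None:
    subprocess.run([exe, '-i', 'esrgan_in.png', '-o', 'esrgan_out.png'],
                   stdout=subprocess.PIPE)
```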
There are no executables for GFPGAN in Releases https://github.com/TencentARC/GFPGAN/releases. It could perhaps be built, but none is readily available.
About subprocess.run() and PATH, not sure. Someone smarter will know :=D
What I will add is that I found a problem trying to run realesrgan-ncnn-vulkan
from a different directory than where the executable is (passing a relative path). Searching the Issues section, I found the following comment from @VaslD on https://github.com/xinntao/Real-ESRGAN/issues/379
This problem happens because realesrgan-ncnn-vulkan tries to read the pretrained model from the current working directory ($PWD, %CD%), whereas the usual practice is to read data shipped with the application from a standard location (the platform's software installation path, the user data path, or at minimum the executable's own directory).
A temporary workaround is to run realesrgan-ncnn-vulkan with the full path to the models, using the -m parameter:
/usr/local/RealESRGAN/realesrgan-ncnn-vulkan -m /usr/local/RealESRGAN/models -i ... -o ... By default, the model path should be the folder containing realesrgan-ncnn-vulkan with /models appended. If realesrgan-ncnn-vulkan is on $PATH, use which on UNIX or where on Windows to locate the executable, take its parent folder, and append /models to build the -m argument.
Edit: The alternative is to cd into the folder where realesrgan-ncnn-vulkan is located, as suggested above (since cd changes the working directory), but it is better to use pushd instead of cd; after running, popd returns you to the original working directory.
Also, the authors are advised to handle and report errors gracefully. If the model is not found (the models folder is missing, or the -m / -n parameters are wrong), the program should print diagnostic information and exit with an error code instead of crashing with [1] 3515 segmentation fault realesrgan-ncnn-vulkan. Otherwise it looks like a careless programming error and doesn't tell the caller that a parameter is wrong.
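That workaround can be sketched in Python (esrgan_args is an illustrative helper; it assumes the models/ folder ships next to the executable, as the comment above describes):

```python
import shutil
import subprocess
from pathlib import Path

def esrgan_args(infile, outfile, exe_path):
    """Build a command that passes the models folder explicitly via -m,
    so the call works regardless of the current working directory."""
    exe = Path(exe_path)
    models = exe.parent / 'models'   # models/ sits next to the executable by default
    return [str(exe), '-m', str(models), '-i', infile, '-o', outfile]

# Resolve the executable from PATH (the `which` / `where` step described above).
exe = shutil.which('realesrgan-ncnn-vulkan')
if exe is not None:
    subprocess.run(esrgan_args('in.png', 'out.png', exe), stdout=subprocess.PIPE)
```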
On ['./realesrgan-ncnn-vulkan', '-i', 'esrgan_in.png', '-o', 'esrgan_out.png']
As an optimization you can also add the -n parameter to select the model to use.
You will have:
realesrgan-x4plus <-- works very well
realesrgan-x4plus-anime <-- a bit better than x4plus when dealing with manga/drawings
realesrnet-x4plus <-- slower on my Mac
realesr-animevideov3
In addition, putting realesrgan in the root folder of this project is not ideal. You can instead move it to a sibling folder and point subprocess.run at that folder with the cwd="../realesrgan" parameter, e.g.:
subprocess.run(
['./realesrgan-ncnn-vulkan', '-i', 'esrgan_in.png', '-o', 'esrgan_out.png', '-s', scale],
stdout=subprocess.PIPE,
cwd="../realesrgan"
).stdout.decode('utf-8')
Here's an implementation that works on my side.
You can download it with:
wget https://github.com/glonlas/Stable-Diffusion-Apple-Silicon-M1-Install/blob/main/patches/gfpgan_tools.py.patch
wget https://github.com/glonlas/Stable-Diffusion-Apple-Silicon-M1-Install/blob/main/patches/simplet2i.py.patch
git apply gfpgan_tools.py.patch
git apply simplet2i.py.patch
Note: I will do a PR tomorrow, this will be cleaner.
Ok, so I tried your code, and it runs
"superman dancing with a panda bear" -s50 -W512 -H512 -C7.5 -Aplms -G0.4 -U 2.0 0.6 -S2516680777
, but something happens with GFPGAN or RealESRGAN
I get this image before
and after, it transforms it into
Face restoration alone (without upscaling) seems to fail as well. Are you using an Intel Mac? I'm on M1.
dream> "superman dancing with a panda bear" -s50 -W512 -H512 -C7.5 -Aplms -G0.4 -S2516680778
Generating: 0%| | 0/1 [00:00<?, ?it/s] DEBUG: seed at make_image() invocation time =2516680778
PLMS Sampler: 100%|| 50/50 [00:32<00:00, 1.54it/s]
Generating: 100%|| 1/1 [00:33<00:00, 33.07s/it]
>> GFPGAN - Restoring Faces: superman dancing with a panda bear : seed:2516680778
[W NNPACK.cpp:51] Could not initialize NNPACK! Reason: Unsupported hardware.
Intel MKL FATAL ERROR: This system does not meet the minimum requirements for use of the Intel(R) Math Kernel Library.
The processor must support the Intel(R) Supplemental Streaming SIMD Extensions 3 (Intel(R) SSSE3) instructions.
The processor must support the Intel(R) Streaming SIMD Extensions 4.2 (Intel(R) SSE4.2) instructions.
The processor must support the Intel(R) Advanced Vector Extensions (Intel(R) AVX) instructions.
Are you using an Intel Mac? I'm on M1. Yes I am on M1 as well (2020 13" M1).
I did not get this issue at all. Are you on Main or Development branch? I am using the main branch from Sep 4, 2022.
The issue is certainly because my script forces '-n', 'realesrgan-x4plus'
which is made for x4 scaling. I need to make it take care of the -U
value you are passing.
I could not reproduce the issue, but I would not be surprised if it is the root cause of this issue.
I'm using the main branch.
Yes, 4x works.
Okay, so a hack to get 2x to work is to remove '-n', 'realesrgan-x4plus'
I guess we could remove it -and use the default- unless a specific model is passed.
The default seems to be realesr-animevideov3 (not sure which is better, vs. realesrgan-x4plus)
-n model-name model name (default=realesr-animevideov3, can be realesr-animevideov3 | realesrgan-x4plus | realesrgan-x4plus-anime | realesrnet-x4plus)
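One way to honour the -U value is to set -s from it and only pin a model when one clearly matches the scale (a sketch of the idea, not the patch's actual logic; esrgan_model_args is a hypothetical helper):

```python
def esrgan_model_args(scale):
    """Pick -s from the requested scale; only pin a model for x4,
    otherwise fall back to the default (realesr-animevideov3)."""
    args = ['-s', str(scale)]
    if scale == 4:
        args += ['-n', 'realesrgan-x4plus']  # this model only supports x4
    return args
```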
The PR is coming. It will be easier for a review and to create a patch if needed. Much better than this 20 min hack.
@glonlas do you also intend to update the macos README? it has no mention of realesrgan at the moment. Would be super nice!
@Any-Winter-4079 the patch has been updated. It handles both x2 and x4. I am writing a PR for the development branch, but since that branch went through a refactoring, the PR will not be compatible with the main branch.
So if you are using the main branch, please use the command in my previous comment (it is fixed and fully working).
Here's the PR for the development branch: https://github.com/lstein/stable-diffusion/pull/424
May I have a review or +1? Thank you!
cc: @Any-Winter-4079, @lstein
I don't have write access, but I hope @lstein or a contributor see the Pull request and accept it. From my end, I've tested it and added my review. Hope it gets available soon!
So, I had a quick look at getting RealESRGAN working on MPS and it was only a two-line change (and a pip install realesrgan). I did have GFPGAN up and running before that, if that makes any difference.
git diff
diff --git a/ldm/gfpgan/gfpgan_tools.py b/ldm/gfpgan/gfpgan_tools.py
index ff90a83..40ccc97 100644
--- a/ldm/gfpgan/gfpgan_tools.py
+++ b/ldm/gfpgan/gfpgan_tools.py
@@ -75,7 +75,7 @@ def _run_gfpgan(image, strength, prompt, seed, upsampler_scale=4):
def _load_gfpgan_bg_upsampler(bg_upsampler, upsampler_scale, bg_tile=400):
if bg_upsampler == 'realesrgan':
- if not torch.cuda.is_available(): # CPU
+ if not torch.cuda.is_available() and not torch.backends.mps.is_available(): # CPU
warnings.warn(
'The unoptimized RealESRGAN is slow on CPU. We do not use it. '
'If you really want to use it, please modify the corresponding codes.'
@@ -119,7 +119,7 @@ def _load_gfpgan_bg_upsampler(bg_upsampler, upsampler_scale, bg_tile=400):
tile=bg_tile,
tile_pad=10,
pre_pad=0,
- half=True,
+ half=torch.cuda.is_available(),
) # need to set False in CPU mode
else:
bg_upsampler = None
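The logic in that diff boils down to the following torch-free restatement (the booleans stand in for torch.cuda.is_available() and torch.backends.mps.is_available(); a sketch for illustration only):

```python
def pick_device(cuda_available, mps_available):
    """Prefer CUDA, then Apple's MPS, and only fall back to CPU when neither exists."""
    if cuda_available:
        return 'cuda'
    if mps_available:
        return 'mps'
    return 'cpu'

def use_half_precision(cuda_available):
    # half=True is only safe on CUDA; MPS and CPU need full precision here,
    # which is why the diff changes half=True to half=torch.cuda.is_available().
    return bool(cuda_available)
```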
That is interesting. Will try to test this later today. Certainly it is simple and requires very little code change.
One side effect of pip install realesrgan is that it forces a lower numpy version, if my memory doesn't fail me. For example, without realesrgan, I have:
numpy 1.23.2 py39h3668e8b_0 conda-forge
Haven't tested if this version downgrade has performance implications though.
I'll have a peek at what version of numpy I've got installed when the M1 is finished running the preflight checks. I can't recall it downgrading it, but I probably wasn't watching it properly either.
% pip list | grep real
realesrgan    0.2.5.0
% pip list | grep numpy
numpy         1.23.2
I've had a quick look at Real-ESRGAN's git repo and its requirements.txt just lists numpy with no version.
It may not matter, but based on watching CPU and GPU usage while running this, I'm guessing this is falling back to CPU and not using MPS (assuming I'm understanding MPS correctly as an alternate PyTorch device that can use GPU cores on Apple Silicon.) I think there would need to be changes to GFPGAN and Real-ESRGAN to implement MPS as a backend device for them to run on the GPU. I installed GFPGAN and Real-ESRGAN by cloning their git repos and running the setup scripts inside the ldm conda environment.
Perhaps "working on MPS" was not quite the way to put it, but the code was actively blocking anything that wasn't CUDA before. No idea if they are actually using the MPS PyTorch backend, but if the code inside GFPGAN / Real-ESRGAN is neutrally coded it should by default.
Unfortunately it doesn't look like they wrote Real-ESRGAN neutrally, as it's eating CPU at this moment. Neither is GFPGAN, but that's pretty quick anyway.
I'm seeing times of 49s to upscale x4 while the realesrgan-ncnn-vulkan executable takes 1.5s, so it makes a lot of sense that this may be happening.
If so, we may need to reopen the PR from @glonlas (which also allows 3x and better quality).
Or can we use MPS with the current version with some fix?
Interestingly, it's a simple hack to the Real-ESRGAN code to get Real-ESRGAN running on the GPU, and it runs without an error; the results are wrong though.
FYI, I just merged in a PR for the "Embiggen" feature, which does arbitrary upscaling by tiling the image, stretching the tiles and then merging them back together using img2img. It does have a realesrgan step, but that can be turned off and it seems to work pretty well nevertheless.
Are you referencing PR #474? Does this need to come out, or are you talking about a modification?
@lstein I just tried a clean install. It does work. I have two questions/remarks:
1. realesrgan-ncnn-vulkan is faster, but the tile method looks to be working.
2. IMO, the last step is to put this
wget https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth src/gfpgan/experiments/pretrained_models/
inside the macOS guide.
Since I didn't know when to run that download, I did it too early, so GFPGANv1.3.pth didn't go in the proper place (I think the src folder didn't exist when I ran it).
Embiggen sounds fun from a technical point of view, but to fess up, there's a legally licensed copy of TopazLab's GigaPixel AI on my M1, so resizing isn't a hot topic for actual use for me ;-)
The PR is an alternative where a subprocess call to the Real-ESRGAN binary is used rather than the PyPI or git code.
This was part of @blessedcoolant's original design. My understanding is that GFPGAN works better on larger faces, so the upscaling is done before the face reconstruction. It might be worth swapping the order of operations in the run_gfpgan() method to confirm this.
The PR is an alternative where a subprocess call to the Real-ESRGAN binary is used rather than the PyPI or git code.
Oh, I remember that one. The only thing I didn't like about it were concerns it might not be a portable solution across the platforms. Do you know if it runs on Linux/Windows?
This was part of @blessedcoolant's original design. My understanding is that GFPGAN works better on larger faces, so the upscaling is done before the face reconstruction. It might be worth swapping the order of operations in the run_gfpgan() method to confirm this.
Yep. GFPGAN depends on facial landmarks being detectable which is a lot easier to do on larger resolution images. I've used GFPGAN and other face restoration tools for a very long time now and I have found it better to always perform scaling up and then face restoration.
There are always cases where it is irredeemable but otherwise, it works great.
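The recommended ordering can be sketched as follows (the helper names are stand-ins, not this repo's real functions; the point is only that upscaling runs before face restoration):

```python
calls = []

def esrgan_upscale(image, factor):       # stand-in for the upscaler
    calls.append('upscale')
    return image

def gfpgan_restore(image, strength):     # stand-in for face restoration
    calls.append('restore_faces')
    return image

def postprocess(image, upscale_factor=1, gfpgan_strength=0.0):
    # Upscale first so GFPGAN sees larger, easier-to-detect facial landmarks.
    if upscale_factor > 1:
        image = esrgan_upscale(image, upscale_factor)
    if gfpgan_strength > 0:
        image = gfpgan_restore(image, gfpgan_strength)
    return image

postprocess('img', upscale_factor=4, gfpgan_strength=0.4)
```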
GFPGAN might actually be a one-line change for MPS use. It's a bit hard to tell, as it runs too quickly at 512x512 for Activity Monitor to pick it up; I'm doing a 1024x1024 run to see if that makes it run for long enough.
I ran GFPGAN on 2048x2048 and it's extremely fast. It does indeed give very good results.
On the other hand, the upscaler running on the CPU is very slow (slower than the image generation), at least for x4.
I used to run realesrgan-ncnn-vulkan on Metal and it was very fast. Might be worth trying realesrgan-ncnn-vulkan instead if it gives the same quality @lstein.
Tremendous work anyway, finally everything works.
The PR is an alternative where a subprocess call to the Real-ESRGAN binary is used rather than the PyPI or git code.
Oh, I remember that one. The only thing I didn't like about it were concerns it might not be a portable solution across the platforms. Do you know if it runs on Linux/Windows?
It supports Windows/Linux/MacOS https://github.com/xinntao/Real-ESRGAN/releases
The only thing is that GFPGAN doesn't have an executable. So we'd need to run Real-ESRGAN from executable and GFPGAN from source. Not a big deal, but it adds a bit more variety/complexity to the code. On the flip side, performance.
I ran GFPGAN on 2048x2048 and it's extremely fast. It does indeed give very good results. On the other hand, the upscaler running on the CPU is very slow (slower than the image generation), at least for x4. I used to run realesrgan-ncnn-vulkan on Metal and it was very fast. Might be worth trying realesrgan-ncnn-vulkan instead if it gives the same quality @lstein. Tremendous work anyway, finally everything works.
Exactly. Same experience here. Personally, I have realesrgan-ncnn-vulkan set up as well, so I use it instead of this one (because this one is 30x slower), so it's not a big deal for me either, but it might be for other Mac users.
I'd say GFPGAN runs fast enough not to matter really; despite that, I'm waiting to see what the MPS change does (as far as I can tell it works, it's enhancing faces, I just can't tell if it's actually using the GPU).
I'd say GFPGAN runs fast enough not to matter really; despite that, I'm waiting to see what the MPS change does (as far as I can tell it works, it's enhancing faces, I just can't tell if it's actually using the GPU).
Agree.
By the way can't that change be applied to Real-ESRGAN as well? Haven't had much time to look into this part of the code.
I'm afraid not. I've applied it and Real-ESRGAN runs, it's fast and on the GPU, but the output is totally broken.
"This is not the image you're looking for...."
I ran GFPGAN on 2048x2048 and it's extremely fast. It does indeed give very good results. On the other hand, the upscaler running on the CPU is very slow (slower than the image generation), at least for x4. I used to run realesrgan-ncnn-vulkan on Metal and it was very fast. Might be worth trying realesrgan-ncnn-vulkan instead if it gives the same quality @lstein. Tremendous work anyway, finally everything works.
Exactly. Same experience here. Personally, I have realesrgan-ncnn-vulkan set up as well, so I use it instead of this one (because this one is 30x slower), so it's not a big deal for me either, but it might be for other Mac users.
Did you install realesrgan-ncnn-vulkan yourself? Is there anything special for Mac users to know? I don't know how to install it manually myself, as I used it in another project that installed it automatically.
@slk333 Download from here https://github.com/xinntao/Real-ESRGAN/releases
Then unzip, cd into the folder, and run chmod u+x realesrgan-ncnn-vulkan
With that you should be set (you can test with the input.png and output.png that are already there)
To run: ./realesrgan-ncnn-vulkan -i input.png -o output.png
Note you'll have to click "Run anyway" or something similar in System Preferences > Security & Privacy
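A small Python sanity check before shelling out (a sketch; the path is wherever you placed the binary in the steps above):

```python
import os

def upscaler_status(path='./realesrgan-ncnn-vulkan'):
    """Report whether the upscaler binary is present and executable."""
    if not os.path.exists(path):
        return 'missing'
    if not os.access(path, os.X_OK):
        return 'not executable (run: chmod u+x ' + path + ')'
    return 'ok'
```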
Darn it, GFPGAN must have some other 'cuda or cpu' code embedded rather than just guarding at the start like Real-ESRGAN. I used a 4x scale on a 512x512 image but didn't see any GPU usage in the post-processing in Activity Monitor. It might have been too quick to see, but it's looking increasingly unlikely.
I moved this comment by @eduardoarinopelegrin from here. Original comment follows.
Kind of unrelated, but also kind of related (because it involves simplet2i.py). Has anyone got realesrgan working on M1/M2 Macs with this repo? If not, are there any plans or works in progress?
I just got it to work with the realesrgan-ncnn-vulkan executable + updating the appropriate parts of simplet2i.py. So if there are no plans to implement it, or not in the near future, maybe you could include this hack for us M1/M2 Mac users to get it to work.
Steps are:
1. Download the macOS release (realesrgan-ncnn-vulkan-20220424-macos) and move realesrgan-ncnn-vulkan inside stable-diffusion (this project folder). Move the Real-ESRGAN model files from realesrgan-ncnn-vulkan-20220424-macos/models into stable-diffusion/models as well.
2. chmod u+x realesrgan-ncnn-vulkan to allow it to be run.
3. Update simplet2i.py (guarding the new code path with self.device.type == 'mps'). The changes are: import subprocess, add a function to do the upscaling, and modify this part to call the function.
This is the full file. This may be more appropriate as a pull request, but since I've never done one (so far!) and simplet2i.py is being modified (from the version I implemented my changes on), maybe it's easier to add it to your version if you see fit. simplet2i.py.zip
Usage:
python3 ./scripts/dream.py
dream >Anubis the Ancient Egyptian God of Death riding a motorbike in Grand Theft Auto V cover, with palm trees in the background, cover art by Stephen Bliss, artstation, high quality -U 4 -m ddim -S 1469565
Obtained/expected result (4x upscale):