Closed gymdreams8 closed 11 months ago
the problem is that i don't have an m2 platform available for testing, so i can't reproduce the issue and really move forward - and community work on m2 has been slow at best. this kind of problem really requires a lot of tracing/debugging, even if the fix at the end may be just a single line.
i'd love to support m1/m2 better, but my hands are tied at the moment.
Understood. Please let me know if there is anything I can help with. This GitHub profile I'm using is semi-anonymous, but I'm a Python developer, and if you can point me to the potential issue, I can try to debug it myself.
The challenge for me is that I don't use TensorFlow, so it could take me a while to read through this library, and of course I don't actually know the inner workings of the Stable Diffusion tech (ha).
BUT what I can do is check whether I get the same problem on the non-forked version of Automatic1111 (which I haven't tried yet). If the issue doesn't reproduce there, would that help you narrow it down?
changes between sdnext and original are too great by now, checking that would not help.
first thing would be to get a much deeper traceback: search for max_frames and extra_lines, triple those values, and see what pops out.
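For illustration, this is roughly what that tweak might look like. The assumption here (not confirmed in the thread) is that traceback formatting goes through the `rich` library, whose `traceback.install()` accepts exactly these two parameters:

```python
# Hypothetical sketch: if tracebacks are rendered via the `rich` library,
# raising these two values shows more frames and more context per frame.
from rich.traceback import install

# rich's defaults are extra_lines=3 and max_frames=100; triple them while debugging
install(extra_lines=9, max_frames=300)
```

With those values in place, re-running the failing generation should print a much longer traceback pointing at where the dtype mismatch originates.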
fyi, root cause is that some part of the model is clearly running in fp32 and part is running in fp16. normally that is not a problem as autocast automatically adjusts, but autocast is broken on m1/m2 in torch itself, so all parts must be manually aligned.
alternatively, you can force everything to fp32 and it will work. if we cannot get to the root cause, i might just do that (e.g. if the platform is m1/m2, force fp32), but that comes at a performance cost.
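As a minimal sketch of that root cause (assuming PyTorch is installed; the single conv layer below is purely illustrative, not part of the actual model): a layer whose weights are fp16 fed fp32 activations raises exactly the kind of dtype-mismatch error reported later in this thread, and aligning everything to one dtype makes it go away.

```python
import torch

# One layer of a hypothetical model converted to fp16, input left in fp32.
conv = torch.nn.Conv2d(3, 8, kernel_size=3).half()
x = torch.randn(1, 3, 16, 16)  # fp32 by default

try:
    conv(x)  # without a working autocast, mixed dtypes fail
except RuntimeError as err:
    print(err)  # input/weight dtype mismatch

# Manual alignment: force everything to fp32 (works everywhere, but slower).
conv = conv.float()
out = conv(x)
```

This is what "force everything to fp32" means in practice: every module and every input tensor ends up in the same dtype, so no cast is ever needed at a layer boundary.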
Sounds good. I will see if I can look into it. I have never read this source, so I can't promise anything. I'm also somewhat busy at work, so no promises that I'll be able to look into it as promptly as you respond to my issue.
Also, thanks for the suggestion for the workaround.
i have this bug too, but on the original webui version it runs fine on my macos
@sukualam interesting… I just cloned the original A1111 version and I don't have any issue. The only thing I had to do was add --no-half to the A1111 user startup script, otherwise it throws errors.
./webui-user.sh:
export COMMANDLINE_ARGS="--no-half"
And then I also added an optimization by installing my own version of PyTorch, using the helpful instructions from ComfyUI. I don't think this is necessary, but in my experience it does seem to speed things up on M1/M2 (I am on an M2 Max):
Quoting verbatim here: https://github.com/comfyanonymous/ComfyUI
You can install ComfyUI in Apple Mac silicon (M1 or M2) with any recent macOS version
Install pytorch nightly. For instructions, read the Accelerated PyTorch training on Mac Apple Developer guide (make sure to install the latest pytorch nightly).
I use pyenv so I installed mine through pip:
pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
I distinctly recall that this step was necessary for running Automatic1111 a long time ago, and that it solved a lot of issues once I did it, though I don't know whether it's still required.
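A quick way to confirm the nightly build actually took effect in the environment SD.Next uses (this is a generic check, not something from the thread):

```python
import torch

# A nightly wheel reports a ".dev" version string, e.g. "2.1.0.dev20230811".
print(torch.__version__)

# On Apple Silicon, the Metal (MPS) backend should be built and available.
print(torch.backends.mps.is_built(), torch.backends.mps.is_available())
```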
@gymdreams8 that's EXACTLY what i said earlier, except in sd.next you set it in settings and in a1111 you use the cmd line flag --no-half.
@vladmandic hey, sorry, I still haven't found time to go through your suggestion, but may I ask which part of your earlier message corresponds to the --no-half setting? I don't think my error had anything to do with that, though. Does it?
I am unfamiliar with any of these source codes, as I have never looked into them. It doesn't help that I haven't worked with PyTorch before, so it will take me some time to see what max_frames and extra_lines are about.
I promise I will get to it!
may I ask which part in your earlier message is corresponding to the --no-half settings
--no-half doesn't exist as a cmd flag in sd.next, it was moved to settings months ago and in settings it's called "Use full precision for model (--no-half)"
which is what i was referring to when i earlier said
alternatively, you can force everything to fp32 and it will work
I was seeing this same issue on an M1 MBP. It was resolved by installing the PyTorch nightly (as mentioned by @gymdreams8).
$ uname -mprsv
Darwin 21.6.0 Darwin Kernel Version 21.6.0: Mon Aug 22 20:19:52 PDT 2022; root:xnu-8020.140.49~2/RELEASE_ARM64_T6000 arm64 arm
$ pip3 install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
Collecting torch
Downloading https://download.pytorch.org/whl/nightly/cpu/torch-2.1.0.dev20230811-cp39-none-macosx_11_0_arm64.whl
Collecting torchvision
Downloading https://download.pytorch.org/whl/nightly/cpu/torchvision-0.16.0.dev20230811-cp39-cp39-macosx_11_0_arm64.whl (1.6 MB)
Collecting torchaudio
Downloading https://download.pytorch.org/whl/nightly/cpu/torchaudio-2.1.0.dev20230811-cp39-cp39-macosx_11_0_arm64.whl (1.8 MB)
Collecting sympy
Using cached sympy-1.12-py3-none-any.whl (5.7 MB)
Collecting fsspec
Using cached fsspec-2023.6.0-py3-none-any.whl (163 kB)
Collecting jinja2
Downloading https://download.pytorch.org/whl/nightly/Jinja2-3.1.2-py3-none-any.whl (133 kB)
Collecting networkx
Using cached networkx-3.1-py3-none-any.whl (2.1 MB)
Collecting typing-extensions
Using cached typing_extensions-4.7.1-py3-none-any.whl (33 kB)
Collecting filelock
Using cached filelock-3.12.2-py3-none-any.whl (10 kB)
Collecting pillow!=8.3.*,>=5.3.0
Using cached Pillow-10.0.0-cp39-cp39-macosx_11_0_arm64.whl (3.1 MB)
Collecting requests
Using cached requests-2.31.0-py3-none-any.whl (62 kB)
Collecting numpy
Using cached numpy-1.25.2-cp39-cp39-macosx_11_0_arm64.whl (14.0 MB)
Collecting MarkupSafe>=2.0
Using cached MarkupSafe-2.1.3-cp39-cp39-macosx_10_9_universal2.whl (17 kB)
Collecting urllib3<3,>=1.21.1
Using cached urllib3-2.0.4-py3-none-any.whl (123 kB)
Collecting certifi>=2017.4.17
Using cached certifi-2023.7.22-py3-none-any.whl (158 kB)
Collecting idna<4,>=2.5
Downloading https://download.pytorch.org/whl/nightly/idna-3.4-py3-none-any.whl (61 kB)
Collecting charset-normalizer<4,>=2
Using cached charset_normalizer-3.2.0-cp39-cp39-macosx_11_0_arm64.whl (124 kB)
Collecting mpmath>=0.19
Using cached mpmath-1.3.0-py3-none-any.whl (536 kB)
I also needed to set Use full precision for model (--no-half)
Did both: Full Precision switch and PyTorch nightly and still cannot generate anything: "Input type (c10::Half) and bias type (float) should be the same". Never had that runtime error issue with Vlad/automatic nor with automatic 1111.
Mac M1
@MajorGruberth
Did both: Full Precision switch and PyTorch nightly and still cannot generate anything: "Input type (c10::Half) and bias type (float) should be the same". Never had that runtime error issue with Vlad/automatic nor with automatic 1111.
If you followed the convo above, you'll see that the Full Precision switch and PyTorch nightly will ONLY fix the Automatic1111 issue. It won't fix the Vlad issue, which is separate and is what this issue is about.
As Vlad mentioned, his fork has diverged too much from Automatic1111 by now, so what fixed A1111 will not fix the Vlad issue.
@fotoetienne which issue did it fix? Automatic1111 or Vlad? That solution only fixes the Automatic1111 main branch; it does not fix Vlad's fork. For now I just use A1111 because I need to be able to keep running my tasks. I don't think the Vlad issue is the same one, since it started with updates after June, and my environment already had those nightlies applied.
Both issues got fixed overnight, I wonder how... Today auto 1111 was running smoothly as well as Next
@MajorGruberth Well, the PyTorch nightly will make everything faster because it's optimized. But how does that fix Next??? Is it from the latest commit? I haven't done a pull since reporting this issue…
SD.Next is tested with torch nightly; I upgrade it every few weeks (not really every day, but close enough).
Hi!
It works when I disable the use of fp16, like this (sd_models.py, line 458):
def repair_config(sd_config):
    if "use_ema" not in sd_config.model.params:
        sd_config.model.params.use_ema = False
    if shared.opts.no_half:
        sd_config.model.params.unet_config.params.use_fp16 = False
    elif shared.opts.upcast_sampling:
        sd_config.model.params.unet_config.params.use_fp16 = False  # CHANGED
    if getattr(sd_config.model.params.first_stage_config.params.ddconfig, "attn_type", None) == "vanilla-xformers" and not shared.xformers_available:
        sd_config.model.params.first_stage_config.params.ddconfig.attn_type = "vanilla"
This emulates the behavior of the --no-half parameter.
but that changes upcast sampling so it's the same as no-half - what's the point of that? can't you just use no-half instead?
I don't know what you have done, but I have just verified that everything is working again on a fresh install and a new virtualenv. I wrote an article with all my steps:
https://docs.gymdreams8.com/mac_sdnext.html
The main thing is that I installed PyTorch Nightly for Apple Silicon before running SD.Next for the first time, but otherwise everything works out of the box without throwing an error.
I don't know whether you also install PyTorch Nightly when SD.Next first starts; if you do, I'll just remove that instruction, though I don't see how it would hurt to install it anyway.
The steps are based on the latest commit, which is https://github.com/vladmandic/automatic/commit/81129cc4b7e451701d6d5ed2127424e8f4ac6685
Since this resolves my issue, should I close it? It’s not an issue for me anymore, but I see that there are active discussions here, so I’m not sure. Let me know.
i made a change that i hoped would help - glad it did. re: torch, sdnext does not install the nightly build, it installs the latest release. but if you install torch yourself, it will detect it and try to use it, not force a reinstall - so installing the nightly build first, as you did, is a good thing in this case. (btw, thanks for the link to your install docs)
i'll close the issue since the majority of the remaining thread is a lot of noise - if the issue persists for other users, let's start clean.
but that changes upcast sampling so its same as no-half - what's the point of that, can't you just use no-half instead?
Translation to English:
I tried the --no-half parameter from the CLI but it didn't work, so I forced it in the code to see if it could help. I see that it's already handled and works without needing the parameter, thanks!! :)
@vladmandic Btw, I know that this is no longer an issue, but since I recently had to install ComfyUI for SDXL (to use a very extensive workflow that's insanely good, SeargeSDXL), I saw this part of the doc that you might find interesting:
Launch ComfyUI by running python main.py --force-fp16. Note that --force-fp16 will only work if you installed the latest pytorch nightly.
https://github.com/comfyanonymous/ComfyUI
You talked about fp16/fp32 above, and this would seem to imply that on Macs, fp16 is only possible if PyTorch Nightly is installed.
Issue Description
I have not been able to run this on OSX with the latest update. I don’t know which version broke the install, because I haven’t pulled an update since June. I asked on Discord for help and after showing my debug logs, was asked to create an issue here.
Steps to reproduce:
Because the log is quite long, I have put it in a gist:
https://gist.github.com/gymdreams8/c12c88b5f608f886f0d8d08005b07612
Workaround:
Right now, in order to keep running, I have checked out an earlier version. The only reason I even found a version that worked for me is that I recalled it always showed on bootup that diffusers were not at the latest version and would fetch them on bootup.
So I searched for the last version that has that from the repo:
Found the commit hash:
https://github.com/vladmandic/automatic/commit/c90e9965c7b6b4d90bb3d63e3c58352309228e5c
And I just used this version for now. That's quite a few commits in the past, but perhaps this will at least help you narrow down the possible issue.
If you let me know which versions contain fairly serious breaking changes, I can test a list of commits and let you know. I just don't want to go through every commit blindly myself.
Thanks very much! If you wish to chat with me directly, my username on Discord is gymdreams, without discriminator (new-style username).
Version Platform Description
Relevant log output
Acknowledgements