Merramore / stable-diffusion-webui-extension-DirectML

Make it work!

Godspeed, you magnificent maniac... #1

Open vt-idiot opened 1 year ago

vt-idiot commented 1 year ago

Is this what I think it is? An extension to make AUTOMATIC1111 webui run using pytorch-directml, and thus potentially on AMD GPUs in Windows instead of under Linux with ROCm and a million asterisks?

Godspeed. I took one look at making it work and decided it wasn't worth the headache of replacing every single torch...cuda call with torch...dml only to later find that something other than pytorch was also CUDA only and getting in my way, or that a function wasn't supported.

...I'll start a bounty at $42.
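For context, the "replace every torch...cuda call with a DirectML device" idea above looks roughly like this. This is a minimal sketch assuming the newer torch-directml package (the fallback logic is illustrative, not the extension's actual code):

```python
# Sketch (hypothetical): route tensors to a DirectML device instead of CUDA,
# falling back to CPU when torch-directml is not installed.
import torch

try:
    import torch_directml
    device = torch_directml.device()  # first available DirectML adapter
except ImportError:
    device = torch.device("cpu")

# Anywhere the webui would do tensor.cuda() / .to("cuda"),
# the extension instead has to move data to the DirectML device:
x = torch.randn(2, 3).to(device)
y = (x * 2.0).cpu()  # bring results back to CPU for further processing
print(y.shape)
```

The pain point described in the comment is exactly this: the `.cuda()` calls are scattered across the webui and its dependencies, and any one library that hardcodes CUDA breaks the whole chain.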

MrNehr commented 1 year ago

It works really well with my RX 5600 XT. Just found out about this. It's a life saver. From 5 min on CPU down to 32 sec.

vt-idiot commented 1 year ago

> It works really well with my RX 5600 XT. Just found out about this. It's a life saver. From 5 min on CPU down to 32 sec.

You got stable-diffusion-webui running on Windows with an AMD GPU? How?

MrNehr commented 1 year ago

Using this extension and the automatic1111 SD.

vt-idiot commented 1 year ago

> Using this extension and the automatic1111 SD.

It already works as is?! There goes my weekend.

vt-idiot commented 1 year ago

Bah. It works... kind of. Instant VIDEO_MEMORY_MANAGEMENT_INTERNAL BSOD when I try to generate anything with an 8GB RX 5700.

MrNehr commented 1 year ago

I don't know if it's related, but I'm using this driver: https://www.amd.com/en/support/kb/release-notes/rn-rad-win-22-11-1-mlir-iree

vt-idiot commented 1 year ago

Whelp, there's an experiment for next weekend. It might be related.

What resolution do you usually generate at? I went straight for 768x512, and didn't launch with --medvram or anything like that either.

MrNehr commented 1 year ago

I do 512x512, but I did some 512x816 and it was fine.

edit: you might still need --precision full --no-half (I do have them)

vt-idiot commented 1 year ago

> edit: you might still need --precision full --no-half (I do have them)

Yeah, I went for --skip-torch-cuda-test --precision full --no-half --no-half-vae to begin with. I'll try it again later with the driver you recommended and start out smaller.
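For reference, flags like these normally go into the webui launcher's COMMANDLINE_ARGS rather than being passed by hand each time. A sketch of that configuration, shown in POSIX shell syntax (on Windows, webui-user.bat uses `set` instead of `export`); the flag set is just the one discussed in this thread, not an official recommendation:

```shell
# Hypothetical launcher config using the flags from this thread.
# On Windows (webui-user.bat):
#   set COMMANDLINE_ARGS=--skip-torch-cuda-test --precision full --no-half --no-half-vae
export COMMANDLINE_ARGS="--skip-torch-cuda-test --precision full --no-half --no-half-vae"
```

--precision full and --no-half avoid fp16 paths that DirectML backends have historically had trouble with, at the cost of roughly doubling VRAM use; --medvram could be added on 8GB cards to trade speed for memory headroom.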

Does it work with all samplers, or rather, which samplers have you used it with? DPM++ SDE Karras didn't work (but didn't BSOD) because of the way it tries to access the device. IIRC DPM++ 2S a Karras was what I tried next, and that's what caused the BSOD.

MrNehr commented 1 year ago

Yeah, some samplers don't work. Euler a, Euler, Heun, DDIM, DPM++ 2S a, and DPM++ 2S a Karras work fine. I didn't test them all, but I do know some of the DPM ones don't work for me.