FrakerKill closed this issue 1 year ago
I think you are using different command-line arguments than before.
Add --opt-sub-quad-attention and try again.
I tried it too but after last updates:
Same here, but on a 5300M: it went from 3.5s/it to 6.5s/it. I tried different optimizations, but the result is the same. I also tried a clean install. It is only a speed issue; there is no video memory error.
Can't reproduce. It generates 768x512 at 1.2s/it (first generation) to ~1.07it/s for me (RX 5700 XT). Do you have more than two GPUs?
Which command-line args have you configured? And are you on torch 2.0.0?
I'm using an RX6600. This is a completely clean install, nothing added, all default settings. Adding --opt-sub-quad-attention makes no difference.
I get the memory allocation error with 768x512, but not with 512x512.
My average iteration time is 5.4 seconds. I can get 7.5 seconds with CPU-only.
It was slow for my 6800 until I set these arguments
--precision full --no-half --autolaunch --opt-sub-quad-attention --disable-nan-check --no-half-vae --opt-sdp-attention --opt-split-attention
It is usually slow on the first generation, but after that I get about 2it/s for 512x512. I'm not exactly sure which arguments do what, or which ones I don't need, but it works for me.
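The flag list above includes three attention optimizations at once. As a rough sketch only, assuming a single attention optimization is meant to be active at a time (not verified against the webui source), the list could be trimmed before it goes into webui-user.bat. `build_args` and `ATTENTION_FLAGS` are hypothetical names for illustration and are not part of the webui.

```python
# Hypothetical helper, not part of the webui: trims a flag list so that only
# one attention optimization remains, then prints a webui-user.bat line.
ATTENTION_FLAGS = [
    "--opt-sdp-attention",
    "--opt-sub-quad-attention",
    "--opt-split-attention",
]


def build_args(tokens, attention="--opt-sdp-attention"):
    """Drop repeated tokens and every attention flag except `attention`."""
    kept = []
    for token in tokens:
        if token in ATTENTION_FLAGS and token != attention:
            continue  # assumption: only one attention optimization is needed
        if token not in kept:
            kept.append(token)  # note: this also drops repeated values
    if attention not in kept:
        kept.append(attention)
    return " ".join(kept)


# Flags exactly as reported in the comment above.
reported = (
    "--precision full --no-half --autolaunch --opt-sub-quad-attention "
    "--disable-nan-check --no-half-vae --opt-sdp-attention --opt-split-attention"
).split()

print("set COMMANDLINE_ARGS=" + build_args(reported))
```

Whether the remaining flags (--precision full, --no-half, --no-half-vae, --disable-nan-check) are all needed on a given card is not something this sketch decides; it only removes the overlapping attention options.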
Use --opt-sdp-attention; it has the same performance as --opt-sub-quad-attention and will not generate black images.
Which command-line args have you configured? And are you on torch 2.0.0?
torch 2.0.0, torch-directml 0.2.0.dev230426, with --no-half --precision full --opt-sub-quad-attention
It was slow for my 6800 until I set these arguments
--precision full --no-half --autolaunch --opt-sub-quad-attention --disable-nan-check --no-half-vae --opt-sdp-attention --opt-split-attention
It is usually slow on the first generation, but after that I get about 2it/s for 512x512. I'm not exactly sure which arguments do what, or which ones I don't need, but it works for me.
Perfect with --opt-sdp-attention.
But there are still a lot of allocation problems (this is 512x512):
We can continue this in #38
Is there an existing issue for this?
What happened?
After the latest updates, my RX6600 generates 512x512 at 5-8s/it instead of 1.5it/s. Then, when you try to generate another image, it fails to allocate even a small amount of memory: 45MB
RuntimeError: Could not allocate tensor with 4588800 bytes. There is not enough GPU video memory available!
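To put the numbers above on one scale, here is a small Python sketch that converts it/s to s/it and bytes to MiB. The helper names are made up for illustration, and the inputs are only the figures quoted in this report.

```python
def slowdown(old_it_per_s, new_s_per_it):
    """How many times slower the new speed is than the old one."""
    old_s_per_it = 1.0 / old_it_per_s  # convert it/s to s/it
    return new_s_per_it / old_s_per_it


def bytes_to_mib(n_bytes):
    """Convert a byte count to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return n_bytes / (1024 * 1024)


# 1.5 it/s before the update versus 5-8 s/it after it.
print(round(slowdown(1.5, 5), 1), round(slowdown(1.5, 8), 1))  # 7.5 12.0

# Size of the tensor named in the RuntimeError above.
print(round(bytes_to_mib(4588800), 2))  # 4.38
```

So the reported slowdown is roughly 7.5x to 12x, and the allocation that fails in the quoted error is about 4.4 MiB.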
Steps to reproduce the problem
Generating 512x512 with DPM++ 2M Karras
What should have happened?
Better speed for this GPU
Commit where the problem happens
WEBUI
What platforms do you use to access the UI ?
Windows
What browsers do you use to access the UI ?
Google Chrome, Microsoft Edge
Command Line Arguments
List of extensions
Console logs
Additional information
No response