torch-directml can't release/collect/empty memory because its tensor implementation inherits OpaqueTensorImpl, which can't have storage.
import torch
import torch_directml
device = torch_directml.device()
torch.tensor(1.0).storage() # [torch.storage.TypedStorage(dtype=torch.float32, device=cpu) of size 1]
torch.tensor(1.0).to(device).storage() # NotImplementedError: Cannot access storage of OpaqueTensorImpl
But onnxruntime-directml partially does release memory. If you want something like that, you can use the Olive+ONNX implementation instead.
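For context, a minimal sketch of what the ONNX Runtime path looks like, assuming you already have a model exported to ONNX (the model path, input name, and shape below are placeholders, not this repository's actual code):

import numpy as np
import onnxruntime as ort

# Create a session on the DirectML execution provider instead of torch-directml.
session = ort.InferenceSession("model.onnx", providers=["DmlExecutionProvider"])

# Placeholder input; real names and shapes depend on the exported model.
input_name = session.get_inputs()[0].name
dummy = np.zeros((1, 4, 64, 64), dtype=np.float32)
outputs = session.run(None, {input_name: dummy})

# Releasing the session lets ONNX Runtime free the VRAM it allocated,
# which torch-directml currently cannot do for its tensors.
del session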
Thanks for the response. I'm not sure I want to commit to the Olive+ONNX route, as I would have to optimize all the models I use, when a few commits ago I could generate images just fine without any modifications.
I managed to generate images by using --medvram --medvram-sdxl --opt-split-attention --no-half-vae --disable-nan-check. I guess that's the trick. Even when all the VRAM is full, I can generate an image right after making another one.
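For anyone else trying this, the flags can be set through COMMANDLINE_ARGS in webui-user.bat (the exact set that helps will vary by GPU; this is just the combination above):

set COMMANDLINE_ARGS=--medvram --medvram-sdxl --opt-split-attention --no-half-vae --disable-nan-check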
I confirm that I encounter the exact same issue, though using the provided flags doesn't help in my case. My setup is:
The project manages to generate the first picture for me too. During that process the VRAM gradually fills to 97% in about 15 seconds, and it never decreases after that. There is always a slight stall in the generation process of about 10 seconds at exactly 50%, probably due to the upscaler. After the first picture is generated, the next generation unconditionally fails at a random point in the process, but always in the first half. So essentially it's a single-use tool at this point. After the first generation it has to be reloaded in the most literal way: stopping and relaunching the script.
Does this problem happen only on AMD GPUs? Is there any estimate for a fix? (I understand that it's related to a different project.)
Yes, this is something that AMD users have to find workarounds for, while Nvidia users generally have an easier time, from what I understand.
Here is a great discussion about ways to avoid running out of VRAM and having it crash after running it once.
You should know: this does not happen only for AMD. This will happen for ANY GPU with torch-directml.
Originally, we should use ROCm, a low-level toolkit like CUDA, but we use DirectML because ROCm didn't support Windows (it now partially does). That alone should not cause a memory issue, but torch-directml, a plugin library for PyTorch, inherits OpaqueTensorImpl, which makes it impossible to track the memory of its tensors.
I don't know whether its developers are interested in fixing this issue in torch-directml, but if you want to avoid the memory issue at this point, you should use onnxruntime-directml (implemented with --onnx in this repository, though it will be slow if you don't optimize models with Olive), or wait until PyTorch fully supports ROCm.
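As a rough sketch, using the ONNX path in this repository would just mean launching with that flag, for example via webui-user.bat (any other arguments are up to you):

set COMMANDLINE_ARGS=--onnx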
Thanks for such a detailed reply, @lshqqytiger. From this I see my options are the following:

- switch to Linux and use ROCm
- use the onnxruntime-directml project in conjunction with this one

I'll try the Linux and ROCm option first, as it seems like the onnxruntime-directml solution is not that well established yet.
The Linux + ROCm option will be the best option if you are familiar with the Linux environment. DirectML is the fallback for those who are not.
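For the Linux route, a hedged sketch of the PyTorch side of the setup: the ROCm builds are installed from PyTorch's dedicated wheel index (the ROCm version below is only an example; use whichever version the PyTorch install page currently lists for your GPU):

pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/rocm5.6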
Would the ROCm setup be possible with WSL? I have been reading that WSL can access the GPU, but I don't know whether it is compatible with ROCm or whether someone is working on that: https://learn.microsoft.com/en-us/windows/wsl/tutorials/gpu-compute
ROCm on WSL cannot find the GPU. I think WSL exposes the GPU as a virtual device (like a VM)?
Is there an existing issue for this?
What happened?
I launch the webui from webui-user.bat and it starts up normally, except I notice that once the webui is open in my browser, my VRAM is filled to about 5 GB out of my 8 GB. This seems to happen when it is trying to load the model. So when I try to generate an image, it runs out of VRAM in the middle and the generation is stopped. This even happens on a fresh install of the webui using the default model.
Also, after trying to generate the image unsuccessfully, the VRAM is not released at all. My VRAM stays completely full until I shut down the program.
Another issue is that in the webui I am unable to change which model the generator uses. I pick another model and it shows it processing, but it fails to switch; it also takes up my VRAM and does not release any of it when it finishes.
Steps to reproduce the problem
What should have happened?
I should have enough VRAM to generate a 512x512 image without running out. My VRAM should be freed when I'm not generating anything.
Sysinfo
sysinfo-2023-09-08-20-11.txt
What browsers do you use to access the UI?
Mozilla Firefox
Console logs
Additional information
No response