Different models and implementations have different requirements. In its current state, cugan does have tiling support, so you can just adjust the tile size. I did hear from one person that 8 GB of VRAM was not enough for higher resolutions, though, so your GPU might simply not be sufficient and you would have to accept that. I was mainly testing with 16 GB of VRAM.
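For anyone wondering what tiling actually buys you, here is a minimal sketch of tile-based inference in PyTorch. This is hypothetical and not this repo's exact code (a real implementation also pads and overlaps tiles to hide seams), but it shows why peak VRAM is bounded by the tile size instead of the full frame size:

```python
# Hypothetical sketch of tiled inference (not the project's actual implementation).
# Each tile is upscaled independently, so only one tile's activations live on the
# GPU at a time; shrinking tile_size trades speed for lower peak VRAM.
import torch

def tiled_upscale(model, img, tile_size=512, scale=2):
    """img: 1x3xHxW tensor already on the GPU; returns the upscaled frame."""
    _, c, h, w = img.shape
    out = torch.zeros((1, c, h * scale, w * scale), device=img.device)
    for y in range(0, h, tile_size):
        for x in range(0, w, tile_size):
            tile = img[:, :, y:y + tile_size, x:x + tile_size]
            with torch.no_grad():
                out_tile = model(tile)
            th, tw = tile.shape[2], tile.shape[3]
            out[:, :, y * scale:(y + th) * scale,
                      x * scale:(x + tw) * scale] = out_tile
    return out
```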
Thank you for the explanation. I will try adjusting the tiling size and report back on my findings.
If I may ask: if the amount of VRAM my GPU has is the problem with CUDA models, why does TensorRT work?
The architectures cugan, esrgan and compact are fundamentally different; cugan in particular is known to eat VRAM. It is not only the backend that determines VRAM usage, but also what the model actually does, and some architectures require more resources than others.
TensorRT also does certain things under the hood to try to avoid running out of memory. The TensorRT backends that use the Python API either have sizes hardcoded or use very small models, which should be fine for 8 GB of VRAM, while with the C++ API the tool trtexec tries to avoid problems and can do so to some degree during engine creation. I have 3 different TensorRT APIs in my code, so that's the best summary I can give.
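To make the "avoiding out of memory during engine creation" part more concrete, here is a rough sketch using the TensorRT Python API. The file names are placeholders and this is not necessarily what this project does; capping the builder workspace and enabling FP16 are just the usual knobs for keeping engine building and inference inside a smaller VRAM budget:

```python
# Hypothetical sketch: build a TensorRT engine from ONNX with a capped workspace.
# Roughly what `trtexec --onnx=model.onnx --fp16 --saveEngine=model.engine`
# does on the C++ side. Assumes TensorRT 8.4+ for set_memory_pool_limit.
import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)
builder = trt.Builder(logger)
network = builder.create_network(
    1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
parser = trt.OnnxParser(network, logger)

with open("model.onnx", "rb") as f:  # placeholder file name
    if not parser.parse(f.read()):
        raise RuntimeError(parser.get_error(0))

config = builder.create_builder_config()
# Limit the scratch memory TensorRT may use while searching for kernels.
config.set_memory_pool_limit(trt.MemoryPoolType.WORKSPACE, 2 << 30)  # 2 GiB
config.set_flag(trt.BuilderFlag.FP16)  # FP16 roughly halves activation memory

engine_bytes = builder.build_serialized_network(network, config)
with open("model.engine", "wb") as f:
    f.write(engine_bytes)
```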
I guess I can close this issue, since this is just a hardware limitation.
System specs: Ryzen 9 5900HX, NVIDIA RTX 3070 Mobile, Arch Linux (EndeavourOS) on kernel 5.17.2
Whenever I try to run a model that relies on CUDA, for example cugan, the program exits with an out-of-memory error and stops after having output 4 frames.
However, TensorRT works fine for models that support it (RealESRGAN, for example).
Edit: Running nvidia-smi while the command executes shows that vspipe is allocating GPU memory, but less than 2 GiB of VRAM, far from the 8 GiB my card has.