Closed jaredmontoya closed 2 years ago
The vkQueueSubmit failed error strikes again... Make sure you have the right graphics drivers installed. I don't really know anything about integrated graphics, but based on a little googling it seems like this integrated GPU is too old to have Vulkan drivers on Windows. Since you're on Linux, though, I think you should be able to get it to work. However, since it seems like it's pretty old, it might just have issues no matter what. Unfortunately, this isn't really an error I can solve, as it just seems to be a problem with the official ncnn Python bindings (which the xxxxx-ncnn-vulkan projects don't use). I wish I knew what to do to solve this.
thanks for the explanation
Actually, taking a look again and thinking about it more I've made a realization: you definitely can't upscale this as your integrated graphics card uses your system ram, not your vram. So if it's getting to the point where it's using up all your ram that probably means it's just too much for your PC to handle. How big of an image are you trying to upscale?
@joeyballentine Is this an issue that could be solved with more aggressive tiling?
@RunDevelopment if it's really using system ram then no, because we hold onto the tiles in ram. If anything tiling would just make it worse
The image I tried to upscale was 320x768. If I assume that when the chaiNNer console logs say cpu they mean the CPU and not the integrated GPU, then, as I showed in the third screenshot, ncnn for some reason uses the CPU during upscaling and disables fp16 for that, instead of using the integrated GPU I selected in my second screenshot. My GPU may run out of RAM with the heavier x4plus model, but it can still handle the x4plus-anime model when run by the realesrgan-ncnn-vulkan project. When I run it, the console says fp16 is on, and as far as I know a CPU can't do fp16, so realesrgan-ncnn-vulkan is definitely using the GPU as intended rather than defaulting to the CPU like the ncnn runtime in chaiNNer, and it can run the x4plus-anime model on my GPU without any problems. So what I am trying to say is that maybe ncnn in chaiNNer is defaulting to the CPU because of broken official bindings (like you said) or for some other reason, and that using the CPU instead of the GPU is the cause of the error. Maybe if it used my GPU and fp16 it would run normally, like with realesrgan-ncnn-vulkan -n realesrgan-x4plus-anime. After all, fp16 is really good at lowering VRAM (in my case RAM) usage, and that could be the reason why realesrgan-ncnn-vulkan -n realesrgan-x4plus-anime doesn't run out of memory.
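To put a rough number on the fp16 point, here is a back-of-the-envelope sketch. It is not chaiNNer or ncnn code. The helper name tensor_bytes is made up for illustration, and the 64-channel feature map is an assumed figure, not something taken from the logs.

```python
def tensor_bytes(width: int, height: int, channels: int, bytes_per_element: int) -> int:
    """Size in bytes of one dense width x height x channels tensor."""
    return width * height * channels * bytes_per_element


# Input size mentioned in this thread.
WIDTH, HEIGHT = 320, 768

# Assumption for illustration only: one intermediate feature map with 64 channels.
FEATURE_CHANNELS = 64

fp32 = tensor_bytes(WIDTH, HEIGHT, FEATURE_CHANNELS, 4)  # 4 bytes per fp32 value
fp16 = tensor_bytes(WIDTH, HEIGHT, FEATURE_CHANNELS, 2)  # 2 bytes per fp16 value

print(f"fp32: {fp32 / 2**20:.1f} MiB, fp16: {fp16 / 2**20:.1f} MiB")
```

Under these assumptions a single feature map takes half the memory in fp16. A real network keeps several such buffers alive at once, so actual usage will be higher than this single-tensor figure.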
Sorry, the CPU and fp16 you see in the logs are for pytorch. I keep forgetting to remove those (they log no matter what which is confusing).
NCNN always uses your GPU, as we don't even have code to support CPU upscaling for it. I think the code in realesrgan's CLI is doing something that we aren't, or that the python bindings are not doing. I think @theflyingzamboni found something that was different between them a while back, but we don't really know enough about how ncnn works to properly integrate that.
Also, you still get this error using the RealESRGAN CLI -- does it actually finish processing?
realesrgan-ncnn-vulkan has 0 errors and finishes processing when using the x4plus-anime model, and as far as I remember it also works with the realsr anime video models. But if you are asking about the x4plus model that I had trouble with before, then yes, it still outputs the vkQueueSubmit error, as I didn't change or fix anything. I can only guess how ncnn works based on logs and search results. I don't know how it works either, because I am too scared to try to understand the entire ncnn codebase, and especially to mess with complex C/C++ code, because I am actually really bad at those languages. I only know Python and Go, and recently started learning Rust.
@HACCKKER would you mind testing the latest release (0.14) and seeing if this is still an issue?
ok, I will try
Sad news: same errors in the terminal, vkWaitForFences and vkQueueSubmit errors like before. I used an image similar in resolution and type to the one I used before. Now the output is always this: logs.zip
Also, I noticed that it now crashes after CPU load reaches 100% and then rapidly drops back to what it was before.
I'm guessing then that it is still just related to the fact that you are using integrated graphics... I'm afraid there's nothing else I can do at this point
I just wish I knew what RealESRGAN's CLI was doing differently.... Last I checked, we basically do all the exact same things as it. Really the only difference is it uses extremely small tile sizes
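For illustration, here is a minimal sketch of what a smaller tile size means for a 320x768 image. It is not the code chaiNNer or the RealESRGAN CLI actually uses. The function name split_into_tiles is hypothetical, and the sketch ignores the overlap/padding that real tilers usually add.

```python
def split_into_tiles(width: int, height: int, tile_size: int) -> list[tuple[int, int, int, int]]:
    """Return (x, y, w, h) rectangles that cover the image without overlap."""
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    tiles = []
    for y in range(0, height, tile_size):
        for x in range(0, width, tile_size):
            # Tiles on the right and bottom edges are clipped to the image bounds.
            tiles.append((x, y, min(tile_size, width - x), min(tile_size, height - y)))
    return tiles


for size in (4096, 256, 32):
    tiles = split_into_tiles(320, 768, size)
    largest = max(w * h for _, _, w, h in tiles)
    print(f"tile size {size}: {len(tiles)} tiles, largest tile {largest} pixels")
```

A tile size of 4096 leaves this image as a single tile, while 32 splits it into many small ones. Each pass then works on far fewer pixels, but there are more results to hold onto afterwards.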
you are probably right.
Information:
Description: Even if the GPU is selected for use with ncnn, it uses the CPU and fails
I used the Real-ESRGAN anime model because if I use the regular x4plus model as described in README.md, ./realesrgan-ncnn-vulkan -i inputs -o outputs -n realesrgan-x4plus, I always get vkQueueSubmit failed and can only run its PyTorch or ONNX version on the CPU. The anime model works correctly when used this way: ./realesrgan-ncnn-vulkan -i inputs -o outputs -n realesrgan-x4plus-anime
But when I use it in chaiNNer, I get the vkQueueSubmit failed error on the command line even with the anime model that otherwise runs fine. When I changed the tiling setting from auto to 4096, chaiNNer stopped reporting an unexpected error but started asking for more extreme tiling, and there is nothing more extreme to set than 4096.
I set my integrated gpu as ncnn device
I started chaiNNer from the command line to see the logs, and it says device: cpu and that fp16 is set to False, so I think defaulting to the CPU could be the root of the problem. If it only says cpu because my GPU is integrated, then I don't know what causes the problem. The only other thing I can think of is that Real-ESRGAN's ncnn runtime is modified.
Logs Archive.zip