dimitribarbot / sd-webui-live-portrait

LivePortrait for AUTOMATIC1111 Stable Diffusion WebUI

onnxruntime CUDA 12 and improve load time #2

Closed w-e-w closed 2 months ago

w-e-w commented 2 months ago

There are some issues with your installation sequence:

  1. onnxruntime-gpu: if the device is on CUDA 12, onnxruntime-gpu has to be installed from a different index, otherwise it won't work.
  2. It is important to keep install.py as lightweight as possible. install.py is called in a separate subprocess, so any import has an extra cost; `import torch` alone can slow down execution by about 1 second (on my system). Use the cheapest methods available and reduce what is done in install.py.
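To illustrate point 2, here is a minimal sketch of a cheap presence check (names are illustrative, not the extension's actual code): `importlib.metadata` reads package metadata from disk without importing the package, so it avoids the cost of heavy imports like `torch`.

```python
# Sketch of a lightweight install.py check: importlib.metadata reads
# package metadata from disk without importing the package itself.
from importlib.metadata import PackageNotFoundError, version


def is_installed(package: str) -> bool:
    """Return True if the distribution is installed, without importing it."""
    try:
        version(package)
        return True
    except PackageNotFoundError:
        return False


# Only when the cheap check fails would heavier install logic run,
# e.g. launch.run_pip(...) in an AUTOMATIC1111 install.py.
if not is_installed("onnxruntime-gpu"):
    pass  # install step would go here
```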

I've restructured some of the install.py code and split internal_liveportrait/utils.py into utils and utils_base: utils_base contains the operations needed by install.py, and the code is restructured so that imports happen at the last moment.

Load time comparison, before and after (startup-profile screenshots):
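The "imports at the last moment" idea above can be sketched as follows (the function name and signature are made up for illustration): nothing heavy is imported at module load time, so merely importing the module stays cheap, and the cost of `import torch` is only paid when the function actually runs.

```python
# Hypothetical illustration of deferring a heavy import to the last moment.
# Module-level code stays import-free, so importing this module is cheap.

def load_landmark_model(path: str):
    """Hypothetical loader; the deferred import is only paid on first call."""
    import torch  # deferred: the ~1s import cost only hits callers who need it
    return torch.load(path)
```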
w-e-w commented 2 months ago

Could you please tell me where you get your screenshots from

...... look at my first screenshot, I circled it for you: Startup profile


torch.version.cuda maybe? (In terms of performance, I guess it would be OK to use torch there, as we would only need it if the correct version of onnxruntime-gpu is not installed.)
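A hedged sketch of that suggestion: defer `import torch` to the fallback path and use its bundled CUDA version to pick the extra package index for onnxruntime-gpu. The index URL is a placeholder, and as discussed below in the thread, torch ships its own CUDA runtime, so this may not match the driver's CUDA version; treat it as a heuristic only.

```python
# Hypothetical fallback: cuda_version would come from torch.version.cuda,
# with torch imported only once onnxruntime-gpu is found to be missing.
from typing import List, Optional

ORT_CUDA12_INDEX = "https://..."  # placeholder; see the onnxruntime install docs


def onnxruntime_pip_args(cuda_version: Optional[str]) -> List[str]:
    """Build pip arguments for installing onnxruntime-gpu."""
    args = ["onnxruntime-gpu"]
    if cuda_version and cuda_version.startswith("12"):
        # CUDA 12 builds live on a separate index
        args += ["--extra-index-url", ORT_CUDA12_INDEX]
    return args
```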

Don't quote me on this, but if I recall correctly, PyTorch ships its own internal CUDA runtime and so doesn't really care about the drivers, but this is not the case with onnxruntime: https://discuss.pytorch.org/t/install-pytorch-with-cuda-12-1/174294. This is why, unless you build from source, you shouldn't need the CUDA toolkit.

Assuming I am correct, PyTorch's CUDA version should not be used as an indicator.

Note: IIRC onnxruntime also doesn't need the CUDA toolkit to work, it only needs the GPU driver's CUDA to match:

https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements
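A more direct signal than inspecting CUDA versions up front is to ask onnxruntime itself which execution providers are usable at runtime. A minimal sketch, assuming onnxruntime may or may not be installed:

```python
# onnxruntime reports which execution providers are actually available;
# "CUDAExecutionProvider" missing from the list means the CUDA EP cannot
# be used (wrong build, driver mismatch, missing libraries, etc.).
try:
    import onnxruntime as ort
    providers = ort.get_available_providers()
except ImportError:
    providers = []

if "CUDAExecutionProvider" not in providers:
    print("CUDA execution provider unavailable; available:", providers)
```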

Note: the CUDA runtime and the CUDA toolkit are different things.

To be honest, I'm not sure how well this works with WSL; I suspect it might add an extra layer of complexity.


From my understanding, if you need to compile CUDA-related stuff then you need the CUDA toolkit, but if you can rely on pre-compiled binaries then the driver should be sufficient.

Adding to the confusion, I think the driver and the toolkit both contain the runtime; I'm not sure which one it will use.


I'll repeat again: don't quote me on this, I could be very wrong about the above.

dimitribarbot commented 2 months ago

Could you please tell me where you get your screenshots from

...... look at my first screenshot, I circled it for you: Startup profile

😅. Indeed... I never paid attention to this link in the footer, very useful! Thanks!

Concerning my onnxruntime-gpu issue, you're totally right, the problem was linked to how WSL works under the hood. It's working fine on Windows without WSL. To make it work with animal mode, which needs CUDA Toolkit 11.8 to build the OP file, and as I have NVIDIA drivers using CUDA 12.6, I needed to have both the CUDA 11.8 and CUDA 12.6 runtimes installed! And indeed, the CUDA runtime is installed at the same time as the CUDA Toolkit. With both toolkits installed and my CUDA_HOME pointing to the CUDA 11.8 folder, it's working fine.
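For WSL users, the setup described above might look roughly like this (a sketch only; the paths assume the default /usr/local install locations and may differ on your system):

```shell
# Two CUDA toolkits installed side by side; point CUDA_HOME at 11.8
# for building the animal-mode OP file, while the NVIDIA driver itself
# exposes CUDA 12.x.
export CUDA_HOME=/usr/local/cuda-11.8
export PATH="$CUDA_HOME/bin:$PATH"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64:${LD_LIBRARY_PATH:-}"
```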

Consequently, I will merge your PR and add a note in the installation procedure for animal mode for WSL users. Thanks!