Closed krakowski closed 2 months ago
Please remove the CUSTOM_ONNXRUNTIME_VERSION variable. We don't want to support arbitrary combinations of our plugin and ONNX Runtime.
Is the change to cmake/FetchOnnxruntime.cmake needed for the CUDA provider and for the TensorRT provider's cache? If this change is not essential, please remove it from this PR and include it in another PR.
> Please remove the CUSTOM_ONNXRUNTIME_VERSION variable. We don't want to support arbitrary combinations of our plugin and ONNX Runtime.
Sure, I will remove the commit and force push to my branch :+1:
Some background why I added this:
ONNX Runtime 1.17.1 is compiled against CUDA 11, while my system has CUDA 12 installed. By changing the URL (https://github.com/microsoft/onnxruntime/releases/download/v1.17.3/onnxruntime-linux-x64-gpu-cuda12-1.17.3.tgz) and version (1.17.3), I was able to use the plugin with CUDA 12. The archive naming convention doesn't seem to be stable, so I couldn't find a way around specifying everything manually.
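For later reference, the override in the removed commit worked roughly like this. This is a reconstruction, not the actual diff: CUSTOM_ONNXRUNTIME_VERSION is the variable named above, while CUSTOM_ONNXRUNTIME_URL and the Onnxruntime_* variables are assumed names for illustration.

```cmake
# Sketch (reconstructed, untested) of an override in cmake/FetchOnnxruntime.cmake.
# CUSTOM_ONNXRUNTIME_URL and the Onnxruntime_* variable names are assumptions.
set(CUSTOM_ONNXRUNTIME_VERSION "" CACHE STRING "Override the ONNX Runtime version")
set(CUSTOM_ONNXRUNTIME_URL "" CACHE STRING "Override the ONNX Runtime archive URL")

if(CUSTOM_ONNXRUNTIME_URL)
  # Bypass the predefined URL table entirely and fetch the user-specified archive.
  set(Onnxruntime_VERSION "${CUSTOM_ONNXRUNTIME_VERSION}")
  set(Onnxruntime_URL "${CUSTOM_ONNXRUNTIME_URL}")
endif()

include(FetchContent)
FetchContent_Declare(onnxruntime URL "${Onnxruntime_URL}")
FetchContent_MakeAvailable(onnxruntime)
```

This illustrates why the maintainers objected: any URL/version pair can be injected, so the plugin would have to tolerate arbitrary ONNX Runtime builds.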
Since we do not require contributions to be contained in a single commit, please do not force-push, so the earlier commits remain available for later reference.
@umireon I removed the commit :+1:
I suppose the build script will accept the ONNXRUNTIME_PROVIDER variable and choose the corresponding predefined URL rather than accept any ONNX Runtime version.
I placed the deleted commit in a separate branch for later reference: https://github.com/occ-ai/obs-backgroundremoval/compare/main...krakowski:obs-backgroundremoval:refactor/build
I think cpu, cuda11, and cuda12 providers are available on the official distributions.
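Selecting from those predefined builds could look something like this. A sketch only: the ONNXRUNTIME_PROVIDER variable is the one proposed above, while the exact archive URLs follow the naming of the official release assets and should be treated as assumptions (only the cuda12 pattern is quoted earlier in this thread).

```cmake
# Sketch (untested): pick a predefined download URL from ONNXRUNTIME_PROVIDER
# instead of accepting arbitrary versions/URLs. Archive names are assumptions
# based on the official release asset naming.
set(ONNXRUNTIME_PROVIDER "cpu" CACHE STRING "ONNX Runtime provider: cpu, cuda11, or cuda12")

set(Onnxruntime_VERSION "1.17.3")
set(Onnxruntime_BASEURL "https://github.com/microsoft/onnxruntime/releases/download/v${Onnxruntime_VERSION}")

if(ONNXRUNTIME_PROVIDER STREQUAL "cpu")
  set(Onnxruntime_URL "${Onnxruntime_BASEURL}/onnxruntime-linux-x64-${Onnxruntime_VERSION}.tgz")
elseif(ONNXRUNTIME_PROVIDER STREQUAL "cuda11")
  set(Onnxruntime_URL "${Onnxruntime_BASEURL}/onnxruntime-linux-x64-gpu-${Onnxruntime_VERSION}.tgz")
elseif(ONNXRUNTIME_PROVIDER STREQUAL "cuda12")
  set(Onnxruntime_URL "${Onnxruntime_BASEURL}/onnxruntime-linux-x64-gpu-cuda12-${Onnxruntime_VERSION}.tgz")
else()
  message(FATAL_ERROR "Unknown ONNXRUNTIME_PROVIDER: ${ONNXRUNTIME_PROVIDER}")
endif()
```

This keeps the supported combinations to a fixed list while still letting CUDA 12 users build the plugin.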
Please run ./build-aux/run-clang-format and ./build-aux/run-cmake-format to fix the formatting errors.
I think we first need to update our ONNX Runtime version to 1.17.3 to merge this PR, right?
This should work with ONNX Runtime 1.17.1, too. I changed the version (which is now reverted) because version 1.17.1's CUDA 12 archive (https://github.com/microsoft/onnxruntime/releases/download/v1.17.1/onnxruntime-linux-x64-cuda12-1.17.1.tgz) is missing a shared library.
Sure, I will update our ONNX Runtime to 1.17.3 after this PR is merged.
This suggestion will fix the build problem on Mac.
@krakowski thanks for the great work
I'm not asking to do anything further in this PR, but since we're here already, we may as well add CUDA support for Windows... I mean, it's just packaging the provider .dll for ONNX Runtime.
Description
This PR adds the following new functionality for Linux-based systems:

- In addition to the TensorRT Execution Provider, the CUDA Execution Provider can now also be selected via the GUI. Loading is significantly faster than building the TensorRT engine (for the first time), since the model does not have to be optimized beforehand.
- The TensorRT Execution Provider now uses a cache, so the model does not have to be rebuilt on every start. This reduces the start time from ~50 seconds to ~4 seconds (in my case).
- A small change to the build script allows users to set the ONNX Runtime version using variables.
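The two runtime changes can be sketched against ONNX Runtime's C++ API roughly like this. This is an illustration, not the plugin's actual code: the device id and cache path are placeholders, and the plugin's real option handling may differ.

```cpp
// Sketch (untested): enabling the TensorRT engine cache, or falling back to
// the CUDA Execution Provider, via ONNX Runtime's C++ API.
#include <onnxruntime_cxx_api.h>

Ort::SessionOptions makeSessionOptions(bool useTensorRT)
{
	Ort::SessionOptions options;
	if (useTensorRT) {
		OrtTensorRTProviderOptions trt{};
		trt.device_id = 0;                             // placeholder GPU index
		trt.trt_engine_cache_enable = 1;               // reuse previously built engines
		trt.trt_engine_cache_path = "/tmp/trt-cache";  // placeholder cache directory
		options.AppendExecutionProvider_TensorRT(trt);
	} else {
		OrtCUDAProviderOptions cuda{};
		cuda.device_id = 0; // placeholder GPU index
		options.AppendExecutionProvider_CUDA(cuda);
	}
	return options;
}
```

With the engine cache enabled, TensorRT serializes the optimized engine to the cache path on first load and deserializes it on subsequent loads, which is what turns the ~50-second startup into a few seconds.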
Related Issues
Fixes #549