rapidsai / dask-cuda

Utilities for Dask and CUDA interactions
https://docs.rapids.ai/api/dask-cuda/stable/
Apache License 2.0

localcudacluster with onnxruntime inference #1399

Open · kanglcn opened 1 month ago

kanglcn commented 1 month ago

Hi,

I am working on using a trained deep-learning model for image denoising. The model is saved in ONNX format, and I have successfully deployed it with onnxruntime. The workflow is (a rough code sketch follows the list):

  1. convert the numpy array to a cupy array
  2. do some preprocessing on the cupy array
  3. create the onnxruntime session with GPU support
  4. run the model inference with input and output bound to cupy arrays via IO binding
  5. do some postprocessing on the cupy array
  6. convert the cupy array back to a numpy array
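
Roughly, the per-image function looks like the sketch below (the model path, the exact preprocessing/postprocessing, and the input layout are placeholders for my real code):

```python
import numpy as np
import cupy as cp
import onnxruntime as ort

def denoise(image: np.ndarray) -> np.ndarray:
    # 1. numpy -> cupy (host-to-device copy)
    x = cp.ascontiguousarray(cp.asarray(image, dtype=cp.float32))

    # 2. preprocessing on the GPU (placeholder: per-image normalization)
    x = (x - x.mean()) / x.std()

    # 3. create the onnxruntime session with the CUDA execution provider
    sess = ort.InferenceSession(
        "denoiser.onnx",  # placeholder path
        providers=["CUDAExecutionProvider"],
    )

    # 4. bind input and output directly to cupy device memory so the
    #    inference runs without a host round-trip
    y = cp.empty_like(x)
    binding = sess.io_binding()
    binding.bind_input(
        name=sess.get_inputs()[0].name,
        device_type="cuda", device_id=0,
        element_type=np.float32, shape=x.shape, buffer_ptr=x.data.ptr,
    )
    binding.bind_output(
        name=sess.get_outputs()[0].name,
        device_type="cuda", device_id=0,
        element_type=np.float32, shape=y.shape, buffer_ptr=y.data.ptr,
    )
    sess.run_with_iobinding(binding)

    # 5. postprocessing on the GPU (placeholder: clip to a valid range)
    y = cp.clip(y, 0.0, 1.0)

    # 6. cupy -> numpy (device-to-host copy)
    return cp.asnumpy(y)
```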

Since I have many images to denoise and a single-node, multi-GPU machine, I wrapped the above workflow in one function and want to use dask-cuda to distribute these tasks automatically (see the launcher sketch after this paragraph). However, the workers always die, and I can't see why.
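
This is roughly how I launch the cluster and distribute the work (a minimal sketch reusing the `denoise` function above; the random arrays stand in for my real image stack):

```python
import numpy as np
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

if __name__ == "__main__":
    # one worker per GPU; dask-cuda sets CUDA_VISIBLE_DEVICES per worker,
    # so each task sees its assigned GPU as device 0
    cluster = LocalCUDACluster()
    client = Client(cluster)

    # stand-in for the real image stack
    images = [np.random.rand(512, 512).astype(np.float32) for _ in range(64)]

    futures = client.map(denoise, images)
    results = client.gather(futures)
```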

I ran the same kind of test with a cupy-only processing workflow and it works, but with onnxruntime it never does. I would appreciate it if anybody can help!

Thanks!