Open SimonRelu opened 1 year ago
Update:
It is possible to wrap an OrtValue's CUDA buffer in a cupy ndarray without a copy, using its data pointer, like this:
import numpy as np
import cupy as cp
import onnxruntime as ort

# An OrtValue backed by CUDA memory (e.g. a model output kept on the device)
x = ort.OrtValue.ortvalue_from_numpy(np.random.rand(1, 2, 3).astype(np.float32), 'cuda', 0)
nbytes = int(np.prod(x.shape())) * np.dtype(np.float32).itemsize
mem = cp.cuda.UnownedMemory(x.data_ptr(), nbytes, owner=x)
memptr = cp.cuda.MemoryPointer(mem, 0)
arr = cp.ndarray(x.shape(), dtype=cp.float32, memptr=memptr)  # zero-copy view
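(Passing owner=x makes the cupy memory object keep a reference to the OrtValue, so the underlying CUDA buffer stays alive for as long as the cupy view is in use.)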
However, as far as I know, it is not possible to go the other way, from a cupy ndarray to an OrtValue, without copying the data.
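For reference, the copy-based route looks roughly like this (a sketch; the data is staged through host memory via cp.asnumpy, which is exactly the round trip that DLPack support would avoid):
import cupy as cp
import onnxruntime as ort

y = cp.random.rand(1, 2, 3, dtype=cp.float32)
# cp.asnumpy() copies device -> host; ortvalue_from_numpy() then copies host -> device
y_ort = ort.OrtValue.ortvalue_from_numpy(cp.asnumpy(y), 'cuda', 0)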
Describe the feature request
Currently, the DLPack protocol can only be used in training builds, not in the default (non-training) build. See:
I believe it would make sense to enable this in the main build and not only the training one. Many AI modules already support this:
Having DLPack support in onnxruntime would give us "zero-cost" copies between these modules. This is not only interesting during training. Often, multiple models are used, in which case the output of one model becomes the input of the next. When we want to do pre- or postprocessing between these models, we currently can't do it without moving the data to the CPU using .numpy(). This comes with a significant performance cost.
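Concretely, a chained pipeline currently looks something like the sketch below (model paths and input/output names are made up; the point is that the intermediate result is pulled to the host with .numpy() and copied back to the device for the next model):
import numpy as np
import onnxruntime as ort

sess_a = ort.InferenceSession("model_a.onnx", providers=["CUDAExecutionProvider"])
sess_b = ort.InferenceSession("model_b.onnx", providers=["CUDAExecutionProvider"])

binding = sess_a.io_binding()
binding.bind_cpu_input("input", np.random.rand(1, 3, 224, 224).astype(np.float32))
binding.bind_output("output", "cuda")       # keep model A's result on the GPU
sess_a.run_with_iobinding(binding)
out_a = binding.get_outputs()[0]            # OrtValue backed by CUDA memory

# Postprocessing today: copy to the host, compute there, copy back for model B
host = out_a.numpy()
host = host / host.max()
out_b = sess_b.run(None, {"input": host})[0]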
Describe scenario use case
We want to use cupy to process model outputs in between different inference runs. Cupy supports the DLPack protocol, which would allow us to do this. One option would be to build onnxruntime with training support, but that makes our package quite a bit bigger, which I'd like to avoid.
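To make the ask concrete, here is a rough sketch of what this could look like if the default build exposed the DLPack bindings (to_dlpack() / from_dlpack() / __dlpack__) that, as far as I can tell, the training build already has. It continues from sess_b and the out_a OrtValue of the previous sketch, and the input/output names are again made up:
import cupy as cp
import onnxruntime as ort

# Zero-copy view of the CUDA OrtValue (assumes OrtValue implements __dlpack__)
cp_out = cp.from_dlpack(out_a)
cp_out = cp_out / cp_out.max()                   # processing stays on the GPU

# Zero-copy handoff back to an OrtValue (assumes OrtValue.from_dlpack is available)
in_b = ort.OrtValue.from_dlpack(cp_out.__dlpack__(), False)

binding_b = sess_b.io_binding()
binding_b.bind_ortvalue_input("input", in_b)     # feed model B without touching the CPU
binding_b.bind_output("output", "cuda")
sess_b.run_with_iobinding(binding_b)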