Open guigzzz opened 4 months ago
IOBinding is deprecated.
Unless otherwise instructed, the output OrtValues are created and copied to CPU memory at the end of inferencing.
https://onnxruntime.ai/docs/tutorials/csharp/basic_csharp.html
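For context, a minimal sketch of the default path that comment describes (the model path and the "input"/"output" tensor names are placeholders): a plain Run call returns OrtValues backed by CPU memory even though the CUDA EP executed the model, so GetTensorDataAsSpan works on them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;

// Placeholder model path and tensor names ("input"/"output").
using var options = SessionOptions.MakeSessionOptionWithCudaProvider(0);
using var session = new InferenceSession("model.onnx", options);
using var runOptions = new RunOptions();

float[] inputData = new float[1 * 8];
using var input = OrtValue.CreateTensorValueFromMemory(inputData, new long[] { 1, 8 });
var inputs = new Dictionary<string, OrtValue> { { "input", input } };

// Without IO binding, the returned OrtValues are backed by CPU memory even
// though the CUDA EP ran the model, so reading them as a span is fine.
using var results = session.Run(runOptions, inputs, new[] { "output" });
ReadOnlySpan<float> output = results.First().GetTensorDataAsSpan<float>();
```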
Can you elaborate on 'IOBinding is deprecated'? This is news to me. How else are we supposed to efficiently reuse output OrtValues? If it truly is deprecated, then the documentation should be updated to reflect that.
The 'unless otherwise instructed' part is the crucial bit here. I have my output tensors allocated on the GPU and I only sometimes want to copy them back to the host (it's a time series model, so outputs feed back into the inputs; this is more efficient if everything stays on the GPU), but currently I can't.
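A rough sketch of that kind of feedback loop with IOBinding (not actual production code; the model path, the "state_in"/"state_out" names and the state shape are made up, and it assumes the OrtValue-based BindInput/BindOutput overloads plus the OrtMemoryInfo/OrtAllocator constructors from the newer C# API):

```csharp
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

// Placeholders: model path, "state_in"/"state_out" names and the [1, 16]
// state shape are made up for illustration.
using var options = SessionOptions.MakeSessionOptionWithCudaProvider(0);
using var session = new InferenceSession("timeseries.onnx", options);
using var runOptions = new RunOptions();
using var ioBinding = session.CreateIoBinding();

// CUDA memory info + allocator so the recurrent state can live on the device.
using var cudaMemInfo = new OrtMemoryInfo(OrtMemoryInfo.allocatorCUDA,
    OrtAllocatorType.DeviceAllocator, 0, OrtMemType.Default);
using var cudaAllocator = new OrtAllocator(session, cudaMemInfo);

long[] stateShape = { 1, 16 };

// Two device-resident buffers used in ping-pong fashion: each step's output
// becomes the next step's input without a round trip through host memory.
using var stateA = OrtValue.CreateAllocatedTensorValue(cudaAllocator, TensorElementType.Float, stateShape);
using var stateB = OrtValue.CreateAllocatedTensorValue(cudaAllocator, TensorElementType.Float, stateShape);

// The initial state is a CPU tensor; ORT copies it to the device for the CUDA EP.
float[] initial = new float[1 * 16];
using var initialState = OrtValue.CreateTensorValueFromMemory(initial, stateShape);

OrtValue current = initialState, next = stateA;
for (int step = 0; step < 100; step++)
{
    ioBinding.BindInput("state_in", current);
    ioBinding.BindOutput("state_out", next);
    session.RunWithBinding(runOptions, ioBinding);

    current = next;
    next = ReferenceEquals(next, stateA) ? stateB : stateA;
}

// `current` now holds the final state in GPU memory. This is exactly where a
// CopyOutputsToCpu-style API (or a copying GetTensorDataAsSpan) is missing:
// current.GetTensorDataAsSpan<float>() would just wrap a device pointer.
```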
Describe the issue
Hey guys,
I can't seem to figure out an easy way to copy an OrtValue that's been allocated on the GPU back to the CPU.
OrtValue has a really convenient GetTensorDataAsSpan API, which just seems to wrap the raw pointer into a span, which obviously won't work when the pointer is for memory on the GPU. The Python API has a nice copy_outputs_to_cpu API, which is exactly what I need.
Can we have the same thing added to the dotnet API? Either the GetTensorDataAsSpan API could be updated to do the copying automatically, or a new CopyOutputsToCpu API could be added to the IOBinding class, similar to Python.
To reproduce
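(The original snippet was not captured here; the following is only a hedged reconstruction of the failing pattern, with a placeholder model path, tensor names and shapes: the output is bound to CUDA device memory via IOBinding and then read through GetTensorDataAsSpan.)

```csharp
using System;
using System.Linq;
using Microsoft.ML.OnnxRuntime;

// Placeholder model path and tensor names.
using var options = SessionOptions.MakeSessionOptionWithCudaProvider(0);
using var session = new InferenceSession("model.onnx", options);
using var runOptions = new RunOptions();
using var ioBinding = session.CreateIoBinding();
using var cudaMemInfo = new OrtMemoryInfo(OrtMemoryInfo.allocatorCUDA,
    OrtAllocatorType.DeviceAllocator, 0, OrtMemType.Default);

float[] inputData = new float[1 * 8];
using var input = OrtValue.CreateTensorValueFromMemory(inputData, new long[] { 1, 8 });

ioBinding.BindInput("input", input);
ioBinding.BindOutputToDevice("output", cudaMemInfo);   // keep the output on the GPU
session.RunWithBinding(runOptions, ioBinding);

using var outputs = ioBinding.GetOutputValues();
// The OrtValue below is backed by CUDA device memory, so the span wraps a
// device pointer and reading it from the CPU is where things go wrong.
ReadOnlySpan<float> data = outputs.First().GetTensorDataAsSpan<float>();
```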
Fails with:
Urgency
Not urgent, feature request.
Platform
Linux
OS Version
Ubuntu 20.04
ONNX Runtime Installation
Released Package
ONNX Runtime Version or Commit ID
1.18.0
ONNX Runtime API
C#
Architecture
X64
Execution Provider
CUDA
Execution Provider Library Version
CUDA 12.2