Open RandomPrototypes opened 2 years ago
@RandomPrototypes, same issue here. Have you found any solution to this?
I'm in a similar boat. I have a class which loads the model once and runs inference by calling session->run
. However, in this process I call createTensor
each time before run. Any help with an example of how to bind IO for both non-dynamic and dynamic inputs/outputs would be greatly appreciated.
Hello, I'm trying to bind some output values to CUDA memory to avoid copying back to the CPU.
I want to do the C++ equivalent of
and in the following frames, bind the input to the value from the output of the previous frame.
I guess I should do it with the BindOutput function and configure the memoryInfo to use CUDA memory.
But I'm not sure how to configure memoryInfo for CUDA memory, because I couldn't find any example of it.
Searching through other issues, I saw some code suggesting it's not possible (unless it was added recently),
but also some code that seems to do it:
Due to the lack of documentation, I can't be sure whether this second snippet is really selecting CUDA memory with no CPU copy. Does the parameter name ("Cuda") correspond to the type of memory, or is it just a label we give the memoryInfo object with no effect? Does OrtDeviceAllocator mean GPU/CUDA memory? Is OrtMemTypeDefault the right value?
Can anyone confirm this?
Many thanks