microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

https://onnxruntime.ai

MIT License


[Documentation Request] Java example on I/O binding #17032

Open siyuan1992 opened 1 year ago

siyuan1992 commented 1 year ago

Hi,

Could we add more examples for the Java library, please? I don't see any I/O binding class implemented in the OnnxRuntime-GPU 1.15 Java library. Do we need to implement it ourselves? Some examples would be helpful.

Thanks!

Document Details

Title: I/O Binding
Page: https://onnxruntime.ai/docs/performance/tune-performance/iobinding.html
Page Source: iobinding.md

Craigacp commented 1 year ago

The Java API does not implement IOBinding. I've got a few open PRs which will make the initial round of CPU memory pinning work; that should provide some of the benefit when not using accelerators. After that I plan to look at implementing IOBinding for GPUs.

Craigacp commented 1 year ago

I'm also working on a better example of overall use of the library in Java, but that's waiting on some internal reviews before I can make it public.

siyuan1992 commented 1 year ago

Hi @Craigacp, thanks for the reply!

Could you point me to the open PRs? I'd like to understand how you plan to implement I/O binding.
Also, what's the rough timeline for these features to be added?

Craigacp commented 1 year ago

#16578 & #16835 allow the use of pre-allocated CPU buffers for outputs, and make it easier to reuse a buffer as an input. IOBinding support doesn't have a timeline beyond it being the next thing I'm going to work on. I'm starting discussions with some people at Nvidia and the Java group here at Oracle about how best to access GPU pointers and GPU memory from Java, but we haven't got anything implemented in ORT yet. I would prefer a solution that doesn't require any JNI code which needs the CUDA (or AMD) headers to compile, as that would make the build process for ORT even more complicated than it currently is.
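
To illustrate the input-buffer reuse idea mentioned above, here is a minimal sketch using only `java.nio`. It allocates one direct, native-order `FloatBuffer` and overwrites it in place for each batch instead of allocating a new buffer per call. The class name `ReusableInputBuffer` and its methods are hypothetical and not part of ONNX Runtime. The sketch deliberately does not call ORT so that it runs without the library; the comment in `main` marks where the buffer would be handed to `OnnxTensor.createTensor`, and this is an assumption about how you would wire it up, not the approach taken in the PRs.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;

// Hypothetical helper (not an ONNX Runtime class): owns one direct buffer
// that is refilled for every inference call.
final class ReusableInputBuffer {
    private final FloatBuffer buffer;
    private final long[] shape;

    ReusableInputBuffer(int batch, int features) {
        // Direct buffers live outside the Java heap, so native code can read
        // them without an extra copy. Native byte order matches what the
        // native side expects.
        this.buffer = ByteBuffer.allocateDirect(batch * features * Float.BYTES)
                .order(ByteOrder.nativeOrder())
                .asFloatBuffer();
        this.shape = new long[] {batch, features};
    }

    // Overwrites the contents in place and rewinds the buffer for reading.
    FloatBuffer fill(float[] values) {
        if (values.length != buffer.capacity()) {
            throw new IllegalArgumentException(
                    "expected " + buffer.capacity() + " values, got " + values.length);
        }
        buffer.clear();
        buffer.put(values);
        buffer.rewind();
        return buffer;
    }

    long[] shape() {
        return shape.clone();
    }

    public static void main(String[] args) {
        ReusableInputBuffer input = new ReusableInputBuffer(1, 4);

        FloatBuffer first = input.fill(new float[] {1f, 2f, 3f, 4f});
        // Assumed call site, not executed here:
        // OnnxTensor.createTensor(env, first, input.shape())

        FloatBuffer second = input.fill(new float[] {5f, 6f, 7f, 8f});

        System.out.println(first == second);   // prints "true": same buffer object
        System.out.println(second.isDirect()); // prints "true"
        System.out.println(second.get(0));     // prints "5.0"
    }
}
```

This only covers CPU memory on the input side. It does not bind GPU memory, which is what IOBinding would provide.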