axodox / axodox-machinelearning

This repository contains a pure C++ ONNX implementation of multiple offline AI models, such as StableDiffusion (1.5 and XL), ControlNet, Midas, HED and OpenPose.
MIT License

Access GPU data of result #29

syjeon121 closed this issue 3 months ago

syjeon121 commented 3 months ago

@axodox Hi, thank you for your work. I would like to keep using the GPU data of the result image without downloading it to the CPU. How can I access the GPU data of a txt2img result?

axodox commented 3 months ago

I have not tried it before, but the GetD3D12ResourceFromAllocation method of the OrtDmlApi class should let you get the underlying ID3D12Resource of an OrtValue. There are also methods for providing inputs as GPU resources. The drawback is that, since this is an execution-provider-specific API, you would couple your code closely to DML.
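A rough, untested sketch of that idea in C++, using the ONNX Runtime C++ wrapper. It assumes a Windows build of ONNX Runtime that ships the DirectML execution provider, and that the output tensor was allocated by DML (for example by binding the output to "DML" memory through Ort::IoBinding). The helper name GetD3D12Resource and the kHasOrtDml flag are made up for illustration. Check dml_provider_factory.h in your ONNX Runtime version for the exact ownership rules of the returned pointer.

```cpp
// Sketch only: the guarded section compiles only on Windows with the
// ONNX Runtime + DirectML headers on the include path.
#if defined(_WIN32) && defined(__has_include)
#  if __has_include(<onnxruntime_cxx_api.h>) && __has_include(<dml_provider_factory.h>)
#    define HAS_ORT_DML 1
#  endif
#endif

#ifdef HAS_ORT_DML
#include <d3d12.h>
#include <onnxruntime_cxx_api.h>
#include <dml_provider_factory.h>

// Hypothetical helper: returns the ID3D12Resource behind `value`.
// `value` must be a tensor allocated by the DML execution provider;
// use the resource only while `value` is still alive.
inline ID3D12Resource* GetD3D12Resource(Ort::Session& session, Ort::Value& value)
{
  // Look up the DML-specific API table.
  const OrtDmlApi* dmlApi = nullptr;
  Ort::ThrowOnError(Ort::GetApi().GetExecutionProviderApi(
      "DML", ORT_API_VERSION, reinterpret_cast<const void**>(&dmlApi)));

  // Allocator that describes DML device memory for this session.
  Ort::MemoryInfo memoryInfo("DML", OrtDeviceAllocator, 0, OrtMemTypeDefault);
  Ort::Allocator allocator(session, memoryInfo);

  // For DML tensors the "data pointer" is an opaque allocation handle,
  // not a CPU address; this call translates it into the D3D12 resource.
  ID3D12Resource* resource = nullptr;
  Ort::ThrowOnError(dmlApi->GetD3D12ResourceFromAllocation(
      allocator, value.GetTensorMutableData<void>(), &resource));
  return resource;
}

constexpr bool kHasOrtDml = true;
#else
// Headers not found (or not on Windows): the helper above is not defined.
constexpr bool kHasOrtDml = false;
#endif
```

Keeping the DML-specific call inside one small helper like this at least limits the coupling to a single place in your code.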

Since some operators in DML are CPU-based, I am also not sure that it will bring the speedup you expect. I think that to speed up DML we would first have to increase the parallelism of its graph executor. It seems to me that, due to aggressive memory fencing, only one node might execute at a time, which reduces compute performance and can leave part of the GPU underutilized.

Edit: note that I am currently using my own fork of ONNX / DML, to which I have added significant performance improvements by reducing VRAM usage and managing VRAM better in the DirectML execution provider.

syjeon121 commented 3 months ago

Thank you for your advice. I will try it :smile: