hashJoe opened 3 months ago
Are you trying to optimize native memory usage, Java memory usage or total memory?
I am interested in total memory: on CPU, Resident Set Size (RSS) is measured; on GPU, CUDA memory plus RSS are measured.
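For context on how the CPU-side number above can be obtained, here is a minimal sketch of reading a process's RSS from Linux's `/proc/self/status` alongside the JVM heap usage. The parsing helper and class name are illustrative, not part of any ONNX Runtime API, and the `/proc` path only exists on Linux, so the code falls back gracefully elsewhere.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class RssProbe {
    // Parses the VmRSS line of a Linux /proc/<pid>/status dump and
    // returns the resident set size in kilobytes, or -1 if absent.
    static long parseVmRssKb(String procStatus) {
        for (String line : procStatus.split("\n")) {
            if (line.startsWith("VmRSS:")) {
                // Line format: "VmRSS:     123456 kB"
                String[] parts = line.trim().split("\\s+");
                return Long.parseLong(parts[1]);
            }
        }
        return -1;
    }

    public static void main(String[] args) throws IOException {
        Path status = Path.of("/proc/self/status");
        if (Files.exists(status)) {  // Linux only
            System.out.println("VmRSS (kB): " + parseVmRssKb(Files.readString(status)));
        }
        // JVM-managed heap, for comparison against the whole-process RSS.
        long heapUsed = Runtime.getRuntime().totalMemory() - Runtime.getRuntime().freeMemory();
        System.out.println("JVM heap used (bytes): " + heapUsed);
    }
}
```

The gap between RSS and the JVM heap is roughly the native footprint, which is the part ONNX Runtime's allocations dominate and the part the requested profiling data would explain.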
Describe the feature request
I'm currently working with ONNXRuntime for performance-critical applications in Java, and I've found it challenging to optimize memory usage without detailed insights into tensor allocation lifetimes. In TensorFlow, I am accustomed to using the profiler to obtain metadata about tensor allocations, such as allocation/deallocation timestamps and bytes allocated.
More information about the TensorFlow profiler can be found in RunMetadata and StepStats.
Given the above information, the lifetime of each tensor's allocation can be inferred.
Are there similar profiling capabilities that allow us to track the lifetime of tensor allocations in ONNXRuntime?
Using SessionOptions#enableProfiling gives no such information.
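To illustrate the limitation: `SessionOptions#enableProfiling` (with `OrtSession#endProfiling` to flush the file) writes operator timing events in Chrome trace format, where each event carries `ts` (start) and `dur` (duration) in microseconds but no per-tensor allocation metadata. The sketch below extracts those two fields from one such event with a plain-stdlib regex; the class name and the sample event are illustrative, not actual profiler output.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TraceEvent {
    // Matches the "ts" and "dur" integer fields of one Chrome-trace event.
    private static final Pattern FIELD = Pattern.compile("\"(ts|dur)\"\\s*:\\s*(\\d+)");

    // Returns {ts, dur} in microseconds, or null if either field is missing.
    static long[] tsAndDur(String eventJson) {
        long ts = -1, dur = -1;
        Matcher m = FIELD.matcher(eventJson);
        while (m.find()) {
            if (m.group(1).equals("ts")) ts = Long.parseLong(m.group(2));
            else dur = Long.parseLong(m.group(2));
        }
        return (ts < 0 || dur < 0) ? null : new long[] {ts, dur};
    }

    public static void main(String[] args) {
        // A made-up operator event of the shape the profiler emits; real
        // files come from SessionOptions#enableProfiling + endProfiling.
        String sample = "{\"cat\":\"Node\",\"name\":\"conv_kernel_time\","
                + "\"ts\":1200,\"dur\":45,\"ph\":\"X\"}";
        long[] td = tsAndDur(sample);
        System.out.println("start=" + td[0] + "us, duration=" + td[1] + "us");
    }
}
```

Everything recoverable from that file is operator-level timing; nothing maps a tensor to its allocation and deallocation times or its size, which is exactly the gap this request is about.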
Describe scenario use case
This information is crucial for identifying bottlenecks and optimizing the memory footprint of models during inference or training.
Such a feature should provide:

- allocation and deallocation timestamps for each tensor
- bytes allocated per tensor
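To make the request concrete, here is a small sketch of the kind of record such a feature could expose and what one could compute from it. The `Allocation` record, field names, and the `bytesLiveAt` helper are hypothetical, not any existing ONNX Runtime API.

```java
import java.util.List;

public class AllocationStats {
    // Hypothetical per-tensor allocation record of the kind requested:
    // timestamps in microseconds, size in bytes.
    record Allocation(String tensorName, long allocMicros, long freeMicros, long bytes) {
        long lifetimeMicros() { return freeMicros - allocMicros; }
    }

    // Total bytes of allocations alive at time t; sweeping t over the
    // trace yields the memory-footprint curve and its peak.
    static long bytesLiveAt(List<Allocation> allocs, long t) {
        return allocs.stream()
                .filter(a -> a.allocMicros() <= t && t < a.freeMicros())
                .mapToLong(Allocation::bytes)
                .sum();
    }

    public static void main(String[] args) {
        List<Allocation> allocs = List.of(
                new Allocation("conv1_out", 0, 100, 4096),   // made-up trace
                new Allocation("relu1_out", 50, 150, 4096));
        System.out.println("live at t=75: " + bytesLiveAt(allocs, 75) + " bytes");
    }
}
```

With records like these, identifying the peak and the tensors responsible for it becomes a simple post-processing step, which is how the TensorFlow RunMetadata/StepStats data mentioned above is typically used.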