OpenVINO Version
tag 2023.0.1
Operating System
Ubuntu 20.04 (LTS)
Device used for inference
iGPU
OpenVINO installation
Build from source
Programming Language
C++
Hardware Architecture
x86 (64 bits)
Model used
BSRN
Model quantization
No
Target Platform
models: model.zip
Performance issue description
Question 1: When the input size changes, the compiled model cannot be reused; the model has to be re-read and recompiled for the new size. The GPU memory held by the previous compilation is not released, which greatly increases memory usage. Is there a solution? The results are in the first compressed package.
Question 2: The GPU consumes far more memory than the CPU; for example, a 1080p image already hits OOM with 2x upscaling on the GPU. The results are in the second compressed package.
1020x678.zip
1080p.zip
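Below is a minimal sketch of the recompile-per-size pattern from Question 1, assuming the BSRN IR is at model.xml and takes an NCHW float input; the file path and shapes are placeholders, not the exact code used:

```cpp
#include <openvino/openvino.hpp>
#include <vector>

int main() {
    ov::Core core;
    // Each entry is a different input size; 350x640 is the model default.
    const std::vector<ov::Shape> sizes = {
        {1, 3, 350, 640},
        {1, 3, 678, 1020},
        {1, 3, 1080, 1920},
    };
    for (const auto& shape : sizes) {
        auto model = core.read_model("model.xml");  // re-read for every new size
        model->reshape(shape);                      // static reshape to this size
        auto compiled = core.compile_model(model, "GPU");
        auto request = compiled.create_infer_request();
        request.infer();  // input left unfilled; enough to observe memory use
        // compiled/request are destroyed here, yet GPU memory usage keeps
        // growing across iterations: this is the behavior being reported.
    }
    return 0;
}
```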
Step-by-step reproduction
Read in the model (default input size 350x640), reshape the model's input tensor to the test image size, and run inference on the CPU/GPU; a sketch of this step follows.
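A hedged single-run version of that reproduction step, again assuming an NCHW f32 input and placeholder path and size (pass "CPU" or "GPU" as the first argument):

```cpp
#include <openvino/openvino.hpp>
#include <algorithm>
#include <string>

int main(int argc, char* argv[]) {
    const std::string device = (argc > 1) ? argv[1] : "GPU";  // "CPU" or "GPU"
    ov::Core core;
    auto model = core.read_model("model.xml");
    // The IR's default input is 350x640; reshape to the test image size.
    model->reshape(ov::PartialShape{1, 3, 678, 1020});
    auto compiled = core.compile_model(model, device);
    auto request = compiled.create_infer_request();
    // Fill the input with dummy data; a real run would copy the image here.
    ov::Tensor input = request.get_input_tensor();
    std::fill_n(input.data<float>(), input.get_size(), 0.5f);
    request.infer();
    return 0;
}
```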
Issue submission checklist