Open hiro28844 opened 2 weeks ago
Hi @hiro28844, does the memory usage remain constant or grow? We do pre-allocate memory for the KV-cache to improve performance.
Hi @natke,
does it remain constant or grow?
It continues to grow. I have attached a video recorded while running the script above.
https://github.com/user-attachments/assets/b03ca877-4933-4a03-ab33-aa861cab0fcc
I can confirm the memory leak in Phi-3-vision. It is probably related to this issue: https://github.com/microsoft/onnxruntime-genai/issues/590. However, the 'max_length' fix suggested there by @PatriceVignola does not change anything for me, probably because image-text processing differs somewhat from text-only processing. I am using the CUDA build of onnxruntime-genai. GPU memory remains constant, but CPU memory increases on every iteration.
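To tell a one-time pre-allocation apart from per-iteration growth, one option is to sample the process's resident memory after each call. The following is only a rough sketch: `sample_memory`, `grows_steadily`, and the `step` callable are hypothetical names, not part of onnxruntime-genai, and `step` stands in for a single call to the multimodal processor from the reproduction script. It relies on the Unix-only `resource` module and reads peak RSS, so it can show growth but not memory being released.

```python
import resource
from typing import Callable, List, Sequence


def peak_rss() -> int:
    """Peak resident set size of this process (KiB on Linux, bytes on macOS)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def sample_memory(
    step: Callable[[], object],
    iterations: int,
    sampler: Callable[[], int] = peak_rss,
) -> List[int]:
    """Call `step` repeatedly, recording memory before the loop and after each call."""
    samples = [sampler()]
    for _ in range(iterations):
        step()  # hypothetical: one call to the multimodal processor
        samples.append(sampler())
    return samples


def grows_steadily(samples: Sequence[int], min_fraction: float = 0.8) -> bool:
    """True if memory rose in at least `min_fraction` of the iterations.

    A one-time pre-allocation shows up as a single jump followed by a flat
    line, whereas a leak keeps rising from one iteration to the next.
    """
    if len(samples) < 2:
        return False
    increases = sum(1 for a, b in zip(samples, samples[1:]) if b > a)
    return increases >= min_fraction * (len(samples) - 1)
```

With `step` bound to the processor call, `grows_steadily(sample_memory(step, 100))` returning `True` would be consistent with the CPU-side growth described above. The 0.8 threshold is an arbitrary choice for illustration, and allocator behavior can hide small leaks for several iterations at a time.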
Describe the bug
When calling the Phi-3-vision multimodal processor, a memory leak appears to occur, causing memory usage to increase continuously.
To Reproduce
Run the following script:
Expected behavior
Memory usage remains constant no matter how many times the multimodal processor is called.
Desktop (please complete the following information):
onnxruntime-genai==0.4.0