Closed: rmes-ai closed this issue 2 weeks ago
The issue may be caused by lack of RAM. @rmes-ai could you check memory usage and provide more details on it?
Additionally, @rmes-ai can you share the URL of the notebook you are trying? If it is about INT4 weight compression, support for it starts with OpenVINO 2023.2; it is not available in 2023.0.2. Thanks!
Thank you @vurusovs & @wenjiew for the prompt support.
Confirming I'm using OpenVINO 2023.2.0 (openvino==2023.2.0 and openvino-dev==2023.2.0).
In terms of memory usage, yes, it's using up all of my 16 GB of RAM.
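Since the issue form lists 2023.0.2 while the comment above reports 2023.2.0, it may help to check which version the running environment actually imports. The snippet below is a minimal sketch: the 2023.2 threshold is taken from the comment earlier in this thread, and the helper names (`parse_release`, `supports_int4`, `installed_openvino_version`) are made up for illustration, not part of OpenVINO.

```python
from importlib import metadata

# Assumption taken from this thread: INT4 weight compression is supported
# starting with OpenVINO 2023.2.
MIN_INT4_VERSION = (2023, 2)


def parse_release(version: str) -> tuple:
    """Return (year, minor) from a version string such as '2023.2.0'."""
    parts = version.split(".")
    return int(parts[0]), int(parts[1])


def supports_int4(version: str) -> bool:
    """True if the given version is at or above the assumed INT4 threshold."""
    return parse_release(version) >= MIN_INT4_VERSION


def installed_openvino_version():
    """Return the installed 'openvino' package version, or None if absent."""
    try:
        return metadata.version("openvino")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_openvino_version()
    if found is None:
        print("openvino is not installed in this environment")
    else:
        print(found, "-> INT4 weight compression expected:", supports_int4(found))
```

Running this inside the same virtual environment as the notebook kernel should show whether the notebook really picks up the 2023.2.0 packages.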
I could repeat the process on my other M2 (64 GB RAM) to see whether the issue is related to RAM?
Yes, it would be great to separate the RAM issue from any other functional problems.
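One possible way to compare the 16 GB and 64 GB machines is to log the peak memory of the Python process around the compression step. The sketch below uses only the standard library and assumes a Unix-like system (macOS or Linux); `run_with_peak_report` is a hypothetical helper, and the step passed to it stands in for whatever notebook cell performs the conversion or compression.

```python
import resource
import sys


def peak_rss_mib() -> float:
    """Peak resident set size of this process, in MiB."""
    peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
    # ru_maxrss is reported in bytes on macOS and in kilobytes on Linux.
    divisor = 1024 * 1024 if sys.platform == "darwin" else 1024
    return peak / divisor


def run_with_peak_report(step, label):
    """Run step() and print the process peak RSS afterwards."""
    result = step()
    print(f"{label}: peak RSS so far {peak_rss_mib():.0f} MiB")
    return result


if __name__ == "__main__":
    # Placeholder workload; in the notebook this would be the compression call.
    run_with_peak_report(lambda: [0] * 1_000_000, "dummy allocation")
```

Note that the report is printed only after the step returns, so it will not appear if the machine reboots mid-step. In that case, watching Activity Monitor or a similar tool while the cell runs is probably the more practical check.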
@alvoron Can you help take a look, since this is ARM (macOS) related? Thanks!
@rmes-ai OpenVINO does not yet natively support i4/i8 inference on ARM, so model compression needs to be skipped.
My M2 reboots while LLaVA weight compression is in progress.
If I skip compression and run fp16 inference (OpenVINO on ARM supports FP16 precision natively), I get the error "probability tensor contains either inf, nan or element < 0", which points to an accuracy issue.
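The error text describes three conditions: a probability that is infinite, NaN, or negative. As a rough diagnostic, the same conditions can be checked on the values before they reach the sampling step. The function below is an illustrative sketch in plain Python; `invalid_probability_indices` is a hypothetical name, and in a real pipeline the input would be the model's output probabilities converted to a list.

```python
import math


def invalid_probability_indices(probs):
    """Indices of entries that are inf, nan, or negative."""
    return [i for i, p in enumerate(probs) if not math.isfinite(p) or p < 0]


if __name__ == "__main__":
    sample = [0.2, float("nan"), -0.1, float("inf"), 0.5]
    print(invalid_probability_indices(sample))  # prints [1, 2, 3]
```

If the list is empty for fp32 outputs but not for fp16 outputs on the same input, that would support the suspicion of a precision-related accuracy problem rather than a sampling bug.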
I was able to run the nanoLLaVA notebook with both fp32 and fp16 precision. Could you please try that one with the latest OpenVINO release? ARM functionality is developing rapidly, so it's better to use the latest release. https://github.com/openvinotoolkit/openvino_notebooks/tree/latest/notebooks/nano-llava-multimodal-chatbot
@rmes-ai please feel free to reopen the issue if you have any other questions related to this topic.
OpenVINO Version
2023.0.2
Operating System
macOS Systems for Apple Silicon
Device used for inference
CPU
Framework
None
Model used
LLaVA
Issue description
I followed the exact procedure recommended by the OpenVINO notebooks repository and encountered a memory leak when attempting the optimization to INT4.
Step-by-step reproduction
I followed the exact procedure recommended by the OpenVINO notebooks repository and encountered a memory leak when attempting the optimization to INT4.
Relevant log output
Issue submission checklist