Memory leak when using MLX models

lmstudio-ai / mlx-engine

👾🍎 Apple MLX engine for LM Studio

MIT License

Memory leak when using MLX models #36

Open Me1000 opened 4 days ago

Me1000 commented 4 days ago

When I load Qwen Coder 2.5 32B Q4 MLX (8k context), it uses about 17.3GB of RAM. After a while, it is consuming over 40GB. The memory usage that LM Studio reports never goes down until I eject the model and reload it, at which point it returns to about 17GB.

This is happening on LM Studio 0.3.5 with v0.0.14 of the MLX runtime, on an M4 Max MacBook Pro.

Please let me know if there is anything else you need from me to debug this issue.
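
For anyone trying to reproduce this, here is a minimal sketch of one way to log the growth over time. It is not part of LM Studio or mlx-engine. It assumes you have already found the PID of the process holding the model (for example in Activity Monitor), and it reads resident set size from `ps`, which may not match the figure LM Studio shows for memory allocated through Metal, so treat it as a rough signal only. The helper names `rss_gb`, `sample_rss`, and `growth_ratio` are made up for illustration.

```python
import subprocess
import time


def rss_gb(pid: int) -> float:
    """Return the resident set size of `pid` in GB, as reported by `ps`."""
    # `ps -o rss= -p PID` prints the RSS in kilobytes with no header line.
    out = subprocess.run(
        ["ps", "-o", "rss=", "-p", str(pid)],
        capture_output=True,
        text=True,
        check=True,
    ).stdout
    return int(out.strip()) / (1024 ** 2)


def sample_rss(pid: int, count: int, interval_s: float) -> list[float]:
    """Take `count` RSS samples of `pid`, `interval_s` seconds apart."""
    samples = []
    for i in range(count):
        samples.append(rss_gb(pid))
        if i < count - 1:
            time.sleep(interval_s)
    return samples


def growth_ratio(samples: list[float]) -> float:
    """Ratio of the last sample to the first; 1.0 means no growth."""
    if not samples or samples[0] <= 0:
        raise ValueError("need at least one positive sample")
    return samples[-1] / samples[0]
```

Calling `sample_rss(pid, count=60, interval_s=60.0)` while chatting with the model and then passing the result to `growth_ratio` gives a single number to attach to a report. For the figures above, going from 17.3GB to just over 40GB is a ratio of roughly 2.3.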

neilmehta24 commented 4 days ago

Hello, thanks for the report. This will be fixed in the next version of LM Studio.