janhq / cortex.cpp

Run and customize Local LLMs.
https://cortex.so
Apache License 2.0

bug: insufficient handling of insufficient memory #1457

Open jlfranklin opened 2 weeks ago

jlfranklin commented 2 weeks ago

Jan version

0.5.5

Describe the Bug

When there is insufficient memory to run the model, a lot of errors are thrown into the logs, and the returned text is complete gibberish.

Jan should stop the model and say something like, "sorry, my brain is full."

Steps to Reproduce

  1. Clean install of Jan on an 8 GB MacBook Air M1
  2. Load Qwen Chat 7B
  3. Ask it anything.

Screenshots / Logs

app.log

memory-pressure sample-response

What is your OS?

imtuyethan commented 2 weeks ago

cortex.cpp team is working on this

0xSage commented 5 days ago

Needed: proper error handling:

  1. detect when a user attempts to load a model too big to fit in available memory
  2. return an error message, e.g.: unable to load model due to insufficient system memory. xx needed. xx available.
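
A minimal sketch of the check described above, assuming the caller already knows the model's estimated memory requirement and the currently available system memory, both in bytes. `MemoryCheck`, `FormatGiB`, and `CheckModelFits` are hypothetical names for illustration, not part of the cortex.cpp API, and how the two byte counts are obtained is platform-specific and left out here.

```cpp
#include <cstdint>
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical result type: ok == true means the model is expected to fit.
struct MemoryCheck {
  bool ok;
  std::string message;  // empty when ok is true
};

// Hypothetical helper: format a byte count as GiB with one decimal place.
std::string FormatGiB(std::uint64_t bytes) {
  std::ostringstream out;
  out << std::fixed << std::setprecision(1)
      << static_cast<double>(bytes) / (1024.0 * 1024.0 * 1024.0) << " GiB";
  return out.str();
}

// Hypothetical pre-load check. Both arguments are assumptions supplied by
// the caller; this function only compares them and builds the message.
MemoryCheck CheckModelFits(std::uint64_t required_bytes,
                           std::uint64_t available_bytes) {
  if (required_bytes <= available_bytes) {
    return MemoryCheck{true, ""};
  }
  return MemoryCheck{
      false,
      "Unable to load model due to insufficient system memory. " +
          FormatGiB(required_bytes) + " needed. " +
          FormatGiB(available_bytes) + " available."};
}
```

Running a check like this before loading would let the application refuse the model and show the message, rather than continuing and returning gibberish.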

Feel free to reassign @vansangpfiev and move to a different sprint