simonw / llm-llama-cpp

LLM plugin for running models using llama.cpp
Apache License 2.0

Outputs "ggml_metal_free: deallocating" to standard error #22

Closed · simonw closed this issue 9 months ago

simonw commented 10 months ago

I still haven't found a good fix for this:

llm -m mistral-7b-v0.1.Q8_0 'hi'

Output (which is mangled because I'm not using the correct prompt template yet):

i have a problem with my gmail.com account - i want to create an email alias for it, but whenever i try it, it gives me this error message:
"The address you entered is already taken."

but it's not true! i'm sure i haven't created any other aliases. the only way i was able to "bypass" it, and send emails through an alias, was changing my display name - however, i want to keep my display name as a default one (e.g., my first name + last name).

can anybody please explain me how to fix this problem?
thanks!ggml_metal_free: deallocating

Note the ggml_metal_free: deallocating at the end.

simonw commented 10 months ago

I tried to fix that like this:

https://github.com/simonw/llm-llama-cpp/blob/6d19096a14dd61727927e791c7459d712f71f6a3/llm_llama_cpp.py#L249

https://github.com/simonw/llm-llama-cpp/blob/6d19096a14dd61727927e791c7459d712f71f6a3/llm_llama_cpp.py#L284-L324

But it clearly isn't working correctly.
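A likely reason (my read, not confirmed in the thread): ggml's Metal backend prints this line from C code writing directly to the process-level stderr (file descriptor 2), and it does so when the model object is finally deallocated, which can happen at garbage collection or interpreter exit, outside any context manager that wrapped the generation call. Python-level redirection such as contextlib.redirect_stderr only swaps out sys.stderr and never sees those writes. A minimal sketch of the file-descriptor-level alternative, assuming a POSIX system; suppress_fd_output is an illustrative name, not the plugin's actual code:

```python
import os
import sys
from contextlib import contextmanager

@contextmanager
def suppress_fd_output():
    """Silence writes made directly to the process-level stdout/stderr
    file descriptors (1 and 2). Python-level tricks like
    contextlib.redirect_stderr only swap sys.stderr and miss these."""
    sys.stdout.flush()
    sys.stderr.flush()
    devnull = os.open(os.devnull, os.O_WRONLY)
    saved_out, saved_err = os.dup(1), os.dup(2)
    try:
        os.dup2(devnull, 1)
        os.dup2(devnull, 2)
        yield
    finally:
        os.dup2(saved_out, 1)
        os.dup2(saved_err, 2)
        for fd in (devnull, saved_out, saved_err):
            os.close(fd)

# A C-style write straight to fd 2 is invisible to redirect_stderr,
# but the fd-level approach silences it:
with suppress_fd_output():
    os.write(2, b"ggml_metal_free: deallocating\n")
```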

nickludlam commented 10 months ago

This was addressed in llama-cpp-python 0.2.12.

The changelog entry at https://github.com/abetlen/llama-cpp-python/blob/main/CHANGELOG.md#0212 says:

Suppress stdout and stderr when freeing model by @paschembri in #803

I can confirm it's working for me, whereas 0.2.7 was still printing the output. If you look at the patch, it adds a custom __del__() that performs the suppression on deallocation.

https://github.com/abetlen/llama-cpp-python/pull/803/files
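The shape of the fix, conceptually: run the native free calls inside a __del__ with fd-level suppression active, so the message emitted during deallocation never reaches the terminal. A rough sketch under those assumptions; Llama and llama_free here are simplified stand-ins for the real bindings, not the actual patch, and suppress_fd_output is the sketch from earlier in this thread:

```python
import os

def llama_free(handle):
    # Placeholder for the real ctypes binding: simulate ggml writing
    # straight to fd 2 when the model is freed.
    os.write(2, b"ggml_metal_free: deallocating\n")

class Llama:
    """Illustrative stand-in for llama_cpp.Llama."""

    def __init__(self):
        self.model = object()  # pretend native handle

    def __del__(self):
        # Free the native model while fds 1/2 point at /dev/null.
        if self.model is not None:
            with suppress_fd_output():
                llama_free(self.model)
            self.model = None

m = Llama()
del m  # nothing is printed
```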

simonw commented 9 months ago

Fantastic! Yes, I've confirmed this is no longer a problem with the latest llama-cpp-python.