-
### What happened?
When I use Llama3-8B-Chinese-Chat-f16-v2_1.gguf to run llama.cpp, I get a crash.
Here is my command:
./llama-cli -m /home/c00662745/llama3/llama3/llama3_chinese_gguf/Llama3-8B-Chi…
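For reference, a minimal sketch of a full invocation (the model path, prompt, and token count here are illustrative, not the reporter's exact flags):
```
./llama-cli -m ./Llama3-8B-Chinese-Chat-f16-v2_1.gguf -p "你好" -n 128
```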
-
# Prerequisites
When I install via `pip install llama-cpp-python`, an error occurs. It happens on versions 0.2.81 and 0.2.80; version 0.2.79 installs successfully.
Python 3.11…
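A common workaround while the build failure is investigated is to pin the last known-good version; a minimal sketch (`--verbose` merely surfaces the underlying build error):
```
pip install llama-cpp-python==0.2.79
# or, to inspect why 0.2.81 fails to build:
pip install --verbose llama-cpp-python==0.2.81
```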
-
The model's GPU memory consumption was too high, so I converted it to a llama.cpp (GGUF) model, and the GPU memory usage is now fine.
However, due to the nature of the model converted to llama.cpp in the model …
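For context, a typical convert-then-quantize flow with llama.cpp's own tools looks like the sketch below; the input directory, output names, and quantization type are illustrative:
```
python3 llama.cpp/convert_hf_to_gguf.py ./my-hf-model --outfile model-f16.gguf
./llama.cpp/llama-quantize model-f16.gguf model-Q4_K_M.gguf Q4_K_M
```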
-
Hi,
Thanks for creating this wonderful package!
The [save_to_gguf](https://github.com/unslothai/unsloth/blob/2f2b478868f63b66aaaa93db66ab3d811cddc95e/unsloth/save.py#L867) currently fails because …
-
**Is your feature request related to a problem? Please describe.**
Today, [llama-cpp/llama_chat_format.py] contains 25 chat formats and 4 chat_completion_handlers; this currently forces the diffe…
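For reference, a chat format is currently selected per model at load time; a minimal sketch using the bundled server (the model path and format name are illustrative, not a recommendation):
```
python3 -m llama_cpp.server --model ./model.gguf --chat_format llama-3
```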
-
Right now we call llama.cpp directly; long-term we should go with either llama.cpp directly or llama-cpp-python, because maintaining two different llama.cpp backends isn't ideal: they will never be in…
-
# Expected Behavior
Installing the pre-built v0.2.85 Metal wheels works for all supported versions of Python.
# Current Behavior
Installing the pre-built v0.2.85 Metal wheels for Python {3.10…
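For reference, a minimal sketch of the install being tested, assuming the Metal wheel index URL from the project README:
```
pip install llama-cpp-python==0.2.85 --extra-index-url https://abetlen.github.io/llama-cpp-python/whl/metal
```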
-
The llama.cpp project already has an option to add the `-pg` compiler flag via `LLAMA_GPROF=1`.
But it crashes when `llama-cli` is traced with uftrace, as follows:
```
$ git clone https://github.com/gg…
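# A hedged sketch of the build-and-trace steps that follow (model path,
# prompt, and token count are illustrative, not from the original report):
$ cd llama.cpp
$ make LLAMA_GPROF=1 llama-cli
$ uftrace record ./llama-cli -m ./model.gguf -p "hello" -n 16
```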
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of…
-
### What happened?
The file has since been renamed to 'convert_hf_to_gguf.py', so the documented conversion command needs updating accordingly, and make sure it is run with python3.
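For reference, the updated help invocation would then be:
```
python3 llama.cpp/convert_hf_to_gguf.py -h
```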
@tobi @…