-
### Describe the issue as clearly as possible:
`examples/llamcpp_example.py` is broken.
It seems like the model is producing garbage output (which shouldn't be allowed by the logits processor). …
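For context, a minimal reproduction sketch, assuming outlines' documented llamacpp integration; the model repo, filename, and prompt are placeholders, not taken from this issue:
```python
# Minimal sketch, assuming outlines' llamacpp backend; repo/filename are placeholders.
from outlines import models, generate

model = models.llamacpp(
    "TheBloke/phi-2-GGUF",   # assumption: any small GGUF repo works here
    "phi-2.Q4_K_M.gguf",
)
# The logits processor should constrain output to one of these choices;
# garbage output here would indicate the bug described above.
generator = generate.choice(model, ["Positive", "Negative"])
print(generator("Sentiment of 'I love this': "))
```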
-
### What behavior of the library made you think about the improvement?
I need to install torch, transformers, accelerate, etc. even if I only want to use outlines with the llamacpp backend.
Are these d…
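One common way libraries avoid this is to gate heavy imports behind the backend that actually needs them; a minimal sketch of that pattern (the function name and the extras spec are illustrative, not outlines' actual internals):
```python
# Sketch of an optional-dependency guard; names here are hypothetical.
def require_transformers():
    try:
        import transformers  # only needed for the transformers backend
    except ImportError as e:
        raise ImportError(
            "The transformers backend requires extra dependencies; "
            "install them with: pip install 'outlines[transformers]'"  # hypothetical extra
        ) from e
    return transformers
```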
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the LangChain.js documentation with the integrated search.
- [X] I used the GitHub search to find a …
-
### Start Date
_No response_

### Implementation PR
To run the quantized model as specified at https://huggingface.co/openbmb/MiniCPM-V-2_6-gguf, the parameters that need to be specified include:
./llama-minicpmv-cli -m ../MiniCPM-V-2_6/model/ggml-mode…
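For reference, the same quantized model can also be driven from Python; a sketch assuming llama-cpp-python's MiniCPM-V 2.6 chat handler (the class name and all file paths are assumptions, not from this issue):
```python
# Sketch via llama-cpp-python instead of the CLI; assumes recent releases
# ship MiniCPMv26ChatHandler, and that all paths below are placeholders.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import MiniCPMv26ChatHandler

handler = MiniCPMv26ChatHandler(clip_model_path="mmproj-model-f16.gguf")
llm = Llama(
    model_path="ggml-model-Q4_K_M.gguf",  # placeholder quantized weights
    chat_handler=handler,
    n_ctx=4096,
)
resp = llm.create_chat_completion(messages=[
    {"role": "user", "content": [
        {"type": "image_url", "image_url": {"url": "file://./test.jpg"}},
        {"type": "text", "text": "What is in the image?"},
    ]},
])
print(resp["choices"][0]["message"]["content"])
```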
-
The docs currently use `transformers` all over the place, which makes the examples very difficult to run if you do not have a massive GPU. I'd like to convert all the examples to use `llamacpp` or a quantized version…
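As an illustration, a transformers-based docs example could be rewritten along these lines; a sketch assuming llama-cpp-python's `Llama.from_pretrained` helper (the repo and filename are placeholders, not the docs' actual model):
```python
# Sketch of a quantized replacement for a transformers example;
# repo_id/filename are placeholders.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Qwen/Qwen2-0.5B-Instruct-GGUF",
    filename="*q4_k_m.gguf",   # glob selects the Q4_K_M quantization
    verbose=False,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Explain GGUF in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```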
-
Hi,
I am unable to import LlamaCpp in IPEX.
CODE: `from ipex_llm.langchain.llms import LlamaCpp`
ERROR:
Cell In[5], line 1
----> 1 …
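A quick diagnostic before digging into the traceback; a sketch using only the standard library, assuming the package is distributed under the name `ipex-llm`:
```python
# Sketch: check which ipex-llm build is installed before debugging the import.
from importlib.metadata import version, PackageNotFoundError

try:
    print("ipex-llm version:", version("ipex-llm"))
except PackageNotFoundError:
    print("ipex-llm is not installed in this environment")

# Then retry the failing import to capture the full traceback:
from ipex_llm.langchain.llms import LlamaCpp  # noqa: F401
```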
-
I was trying to find where to set which quantisation to use for the K/V context cache and it seems you can't in LM Studio.
K/V cache quantisation is required to run models context efficiently by re…
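llama.cpp itself does expose this; a sketch via llama-cpp-python, assuming the `type_k`/`type_v` constructor arguments in recent releases (the model path is a placeholder, and a quantized V cache typically also requires flash attention):
```python
# Sketch: quantize the K/V context cache to Q8_0 via llama-cpp-python.
import llama_cpp

llm = llama_cpp.Llama(
    model_path="model.gguf",          # placeholder path
    n_ctx=8192,
    flash_attn=True,                  # needed for a quantized V cache
    type_k=llama_cpp.GGML_TYPE_Q8_0,  # K cache quantization
    type_v=llama_cpp.GGML_TYPE_Q8_0,  # V cache quantization
)
```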
-
I've tried using llamacpp in both docker and native versions using the provided guides:
https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/llama_cpp_quickstart.md
https://g…
-
**The bug**
When using `models.LlamaCpp` the selected tokenizer is always gpt2 (this can be seen in the output when the `verbose=True` arg is set). I have pasted the dumped KV metadata keys:
```
llama_mod…
```
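To confirm whether gpt2 really comes from the file's metadata rather than from the library's fallback, the GGUF header can be inspected directly; a sketch assuming the `gguf` package published from the llama.cpp repo (the model path is a placeholder):
```python
# Sketch: read the tokenizer model declared in a GGUF file's KV metadata.
from gguf import GGUFReader

reader = GGUFReader("model.gguf")  # placeholder path
field = reader.fields["tokenizer.ggml.model"]
value = bytes(field.parts[field.data[0]]).decode("utf-8")
print("tokenizer.ggml.model =", value)  # e.g. "gpt2" or "llama"
```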
-
### Describe the bug
Not sure if this is a widespread issue, but as @osanseviero reported, the Space at https://huggingface.co/spaces/gokaygokay/Gemma-2-llamacpp is broken.
> I tried https://hugging…