-
**The bug**
When using `models.LlamaCpp`, the selected tokenizer is always gpt2 (this can be seen in the output when the `verbose=True` arg is set). I have pasted the dumped KV metadata keys below:
```
llama_mod…
```
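For context, a minimal sketch of the setup being described, assuming extra keyword args are forwarded to `llama_cpp.Llama` (the model path and `n_ctx` value are placeholders); with `verbose=True`, llama.cpp dumps the KV metadata shown above:

```python
# Minimal sketch (not the reporter's exact script): load a GGUF model
# through guidance's LlamaCpp backend with verbose output enabled so the
# selected tokenizer is printed alongside the KV metadata dump.
from guidance import models

lm = models.LlamaCpp(
    "path/to/model.gguf",  # placeholder path to any GGUF model
    verbose=True,          # llama.cpp prints KV metadata + tokenizer choice
    n_ctx=2048,            # placeholder context size
)
```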
-
### Describe the bug
Not sure if this is a widespread issue, but as @osanseviero reported, sharing in https://huggingface.co/spaces/gokaygokay/Gemma-2-llamacpp is broken.
> I tried https://hugging…
-
### Describe the bug
I created a 32-bit (x86) Windows build and noticed that archchecker.dll only works on x64.
When I run the application, I see an error in Player.log:
_Failed to load library F:/LLM_…
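Not from the report, but a quick way to confirm which architecture a DLL targets is to read the machine field from its PE header; a minimal sketch:

```python
# Sketch: report which CPU architecture a DLL was built for by reading
# the Machine field of its PE (COFF) header. One way to confirm that
# archchecker.dll is x64-only.
import struct

MACHINE_TYPES = {0x014C: "x86 (32-bit)", 0x8664: "x64", 0xAA64: "ARM64"}

def dll_architecture(path: str) -> str:
    with open(path, "rb") as f:
        f.seek(0x3C)                      # e_lfanew: offset of the PE header
        pe_offset = struct.unpack("<I", f.read(4))[0]
        f.seek(pe_offset + 4)             # skip the "PE\0\0" signature
        machine = struct.unpack("<H", f.read(2))[0]
    return MACHINE_TYPES.get(machine, hex(machine))

# Example (hypothetical path):
# print(dll_architecture("archchecker.dll"))
```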
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
Hi,
I attempted to reproduce the simple streaming response functionality using workfl…
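For reference, a minimal sketch of the documented workflow-streaming pattern I was trying to follow (the `TokenEvent` class and the token list are illustrative stand-ins, not my actual code):

```python
# Sketch: a step writes events to the stream via the Context, and the
# caller consumes them with handler.stream_events().
import asyncio
from llama_index.core.workflow import (
    Workflow, step, Context, Event, StartEvent, StopEvent,
)

class TokenEvent(Event):
    token: str

class StreamingFlow(Workflow):
    @step
    async def generate(self, ctx: Context, ev: StartEvent) -> StopEvent:
        for token in ["Hello", " ", "world"]:   # stand-in for LLM tokens
            ctx.write_event_to_stream(TokenEvent(token=token))
        return StopEvent(result="done")

async def main():
    handler = StreamingFlow(timeout=60).run()
    async for ev in handler.stream_events():
        if isinstance(ev, TokenEvent):
            print(ev.token, end="", flush=True)
    print(await handler)

asyncio.run(main())
```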
-
### Describe the bug
With a fresh install of 1.15, Exllamav2_HF loads a model just fine... However, when I do a local install of exllamav2, then both it and the Exllamav2_HF loaders break (errors b…
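(Not from the report.) A quick diagnostic for this kind of breakage is to check which `exllamav2` copy actually gets imported, since a local install can shadow the wheel bundled with 1.15; a sketch:

```python
# Sketch: print which exllamav2 installation is being imported and its
# version, to spot a local install shadowing the bundled one.
import importlib.metadata

import exllamav2

print(exllamav2.__file__)                       # which copy is on sys.path
print(importlib.metadata.version("exllamav2"))  # which version is installed
```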
-
Linked to #1217
Given that: the current CI publishes the binary file and installer to the release, but the binary does not include the llamacpp engine.
Expectation: Jan will pull the Cortex binar…
-
### Start Date
_No response_
### Implementation PR
_No response_
### Reference Issues
_No response_
### Summary
Hi, I am trying to load this using Llama.CPP HTTP s…
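For context, a minimal sketch of the pattern I am attempting: serve the GGUF with llama.cpp's HTTP server and query its `/completion` endpoint (the model path, port, and prompt below are placeholders):

```python
# Sketch: query llama.cpp's built-in HTTP server, assuming it was started
# separately, e.g.:
#   ./llama-server -m path/to/model.gguf --port 8080
import requests

resp = requests.post(
    "http://localhost:8080/completion",
    json={"prompt": "Hello, my name is", "n_predict": 32},
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["content"])
```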
-
## Goals
Rename the engines:
- onnxruntime
- llama-cpp
- tensorrt-llm
| # | Name | Supported Formats | Version | Status |
|---|---------------|-------------------|---------|----------…
-
Hello, I have a custom cortex engine (compiled into an engine.dll). Right now I am able to make use of it in a super hacky way, by replacing the default cortex.llamacpp engine.dll with my custom versi…
-
**The bug**
Updating from guidance==0.1.16 to prerelease guidance==0.2.0rc1 causes model.log_prob() to return 0 rather than the true log probs for a generation when using the llama.cpp backend. I hav…
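A minimal sketch of the pattern being described (not my exact repro; the model path and prompt are placeholders, and `compute_log_probs=True` is my assumption about what is needed for log-prob capture):

```python
# Sketch: capture a generation, then read back its log probability.
# Under guidance==0.1.16 this yields a real (negative) log-prob; under
# 0.2.0rc1 with the llama.cpp backend it reportedly returns 0.
from guidance import models, gen

lm = models.LlamaCpp(
    "path/to/model.gguf",    # placeholder GGUF path
    compute_log_probs=True,  # assumption: needed for log-prob capture
)
lm += "The capital of France is " + gen("answer", max_tokens=5)
print(lm.log_prob("answer"))  # expected: a negative float, not 0
```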