-
Hi,
I see you have provided an example for Mistral models, which I was able to build successfully. However, when I try to benchmark these models using GPTSessionBenchmark, I get errors like:
`[TensorRT-LLM][ERR…
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of…
-
### System Info / 系統信息
transformers: 4.44.0
llama.cpp: latest
Hi, when I try to create a GGUF file I get this error:
```
Traceback (most recent call last):
  File "/home/david/llm/llama.cpp/convert…
```
-
# Expected Behavior
Once I set the necessary environment variable (`export LLAMA_CPP_LIB=/most recent build/libllama.so`), the code should execute without any error.
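For reference, a minimal sketch of pointing llama-cpp-python at a locally built `libllama.so` — note there must be no spaces around `=` in the `export`, and the build path below is hypothetical (substitute your own build directory):

```shell
# Point llama-cpp-python at a locally built shared library.
# /opt/llama.cpp/build/libllama.so is a hypothetical path -- use your own.
export LLAMA_CPP_LIB=/opt/llama.cpp/build/libllama.so

# Verify the variable is visible to child processes (e.g. the Python interpreter).
echo "$LLAMA_CPP_LIB"
```

The variable must be exported in the same shell (or shell profile) that later launches Python, otherwise the bindings will fall back to their bundled library.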
# Current Behavior
I created a…
-
```
Collecting FlashRank==0.2.5 (from -r requirements.txt (line 35))
Using cached FlashRank-0.2.5-py3-none-any.whl.metadata (11 kB)
```
```
INFO: pip is looking at multiple versions of flashra…
-
### What happened?
I converted the CodeLlama-7B-instruction model to GGUF format using llama.cpp, but encountered issues with the model's output when loading the converted GGUF file. The model outputs tex…
-
- [x] Use `llama_decode` instead of deprecated `llama_eval` in `Llama` class
- [ ] Implement batched inference support for `generate` and `create_completion` methods in `Llama` class
- [ ] Add suppo…
-
lava-cli.dir\linkLibs.rsp
C:\w64devkit\bin/ld.exe: C:/w64devkit/bin/../lib/gcc/x86_64-w64-mingw32/13.2.0/../../../../x86_64-w64-mingw32/lib/../lib/libpthread.a(libwinpthread_la-thread.o):thread…
-
```
Starting LOLLMS Web UI...
(ASCII-art "LOLLMS" startup banner, truncated)
-
## Describe the bug
When I run this command:
```bash
cargo run --bin mistralrs-server --release --features "cuda" -- -i gguf -m /external/bradley/llama.cpp/models -f llama-31-70B-Q4-K-M.gguf
`…