-
I saw the provided demo of MiniCPM-V 2.0 running on an Android device with a Snapdragon 8 Gen 3. Does MiniCPM-V 2.6 also support on-device deployment and inference on Android?
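For reference, llama.cpp itself cross-compiles for Android with the NDK. A minimal sketch, assuming an arm64 device and that `$ANDROID_NDK` points at an installed NDK; the API level and flags follow the project's Android build notes and may need adjusting:
```
# Cross-compile llama.cpp for an arm64 Android device (sketch; adjust paths/API level).
cmake -B build-android \
  -DCMAKE_TOOLCHAIN_FILE=$ANDROID_NDK/build/cmake/android.toolchain.cmake \
  -DANDROID_ABI=arm64-v8a \
  -DANDROID_PLATFORM=android-23 \
  -DCMAKE_C_FLAGS=-march=armv8.4a+dotprod
cmake --build build-android --config Release
```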
-
# Feature Description
Please provide a detailed written description of what you were trying to do, and what you expected `llama.cpp` to do as an enhancement.
# Motivation
It sounds like it's …
-
### Feature Description
llama.cpp can cache an evaluated prompt's state to a specific file via the `--prompt-cache` flag. I think that exposing this through node-llama-cpp would enable some techniques for sub…
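For context, the underlying llama.cpp CLI usage looks roughly like this (a sketch; the model path, prompt, and cache filename are placeholders):
```
# First run: evaluate the prompt and save its state to prompt.bin.
./main -m models/model.gguf --prompt-cache prompt.bin \
  -p "You are a helpful assistant. ..." -n 64
# Later runs with the same prompt prefix reload prompt.bin instead of
# re-evaluating the prompt, which cuts startup latency.
./main -m models/model.gguf --prompt-cache prompt.bin \
  -p "You are a helpful assistant. ..." -n 64
```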
-
new format:
https://huggingface.co/TheBloke/StableBeluga-13B-GGUF#example-llamacpp-command
old format:
https://huggingface.co/TheBloke/StableBeluga-13B-GGML
- fine-tuned on Llama 2 by follow…
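The linked GGUF card's example boils down to something like the following (a sketch; the quant filename, prompt, and layer count are illustrative):
```
# Run the GGUF-format model with llama.cpp: -ngl offloads 32 layers to the GPU,
# -c sets the context window, -n the number of tokens to generate.
./main -m stablebeluga-13b.Q4_K_M.gguf -ngl 32 -c 4096 -n 128 --color \
  -p "Write a short story about llamas."
```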
-
I installed llama-cpp-python using the command below:
CMAKE_ARGS="-DLLAMA_CUBLAS=on" pip install llama-cpp-python
The speed:
llama_print_timings: eval time = 81.91 ms / 2 runs ( 40…
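If timings like this suggest the wheel was built without GPU support, a clean from-source rebuild is the usual remedy. A sketch, using the flag names from llama-cpp-python's docs of that era (`LLAMA_CUBLAS` was renamed in later versions):
```
# Force a from-source rebuild with cuBLAS, skipping any cached CPU-only wheel.
CMAKE_ARGS="-DLLAMA_CUBLAS=on" FORCE_CMAKE=1 \
  pip install llama-cpp-python --force-reinstall --upgrade --no-cache-dir
```
Note that offloading also has to be requested at load time (e.g. the `n_gpu_layers` parameter of `llama_cpp.Llama`); otherwise inference still runs on the CPU even with a CUDA-enabled build.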
-
First of all thanks for the great work!
## Context
I was trying to use my new build with 2 P40s on Ubuntu 24.04 (Pop!_OS), and it seems to run. I had to check the code and found an endpoint with s…
-
This is a tracking issue for figuring out how the service can process multiple requests in parallel "so users wouldn't notice," so that we don't need to invest heavily in multiple GPUs.
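For reference, llama.cpp's server already exposes knobs in this direction; a minimal sketch (slot count and context size are illustrative, and the binary is named `./server` in older builds):
```
# Serve 4 parallel request slots with continuous batching; note that the
# context window (-c) is shared across all slots.
./llama-server -m models/model.gguf -c 8192 -np 4 -cb
```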
-
### Environment
🐧 Linux
### System
Debian 12
### Version
1.12.7
### Desktop Information
Node: 22.4.1
Backend: llama.cpp & LM Studio
### Describe the problem
I am trying to set the llamacpp se…
-
I currently have a working setup with llama.cpp + Mistral 7B Instruct, using the following `.env.local`:
```
MODELS=`[
{
"name": "Mistral",
"chatPromptTemplate": "{{#each messages}}{{#ifUse…
-
traceback:
```
FAILED: /home/luna/llama-cpp-torch/build/temp.linux-x86_64-cpython-311/llamacpp_kernel.o
/opt/cuda/bin/nvcc --generate-dependencies-with-compile --dependency-output /home/luna/llama-…