-
### Is there an existing issue / discussion for this?
- [x] I have searched the existing issues / discussions
### Is this question already answered in the FAQ?
-
### What is the issue?
The main_gpu option is not working as expected.
My system has two GPUs. I sent the following request to `/api/chat`:
```
{
  "model": "llama3.1:8b-instruct-q8_0",
  "message…
```
-
### What behavior of the library made you think about the improvement?
This issue is just meant as a Q&A, as I couldn't find anything specific on this.
The question is why there is a dependenc…
-
### The following must be checked before submitting
- [X] Make sure you are using the latest code from the repository (git pull)
- [X] I have read the [project documentation](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki) and the [FAQ section](https://github.com/ymcui/Chinese-LLaMA-Alpaca-3/wiki/常见问题), and have already searched the issues for this…
-
A recent [paper](https://arxiv.org/pdf/2309.17453.pdf) by Meta/MIT/CMU proposed [StreamingLLM](https://github.com/mit-han-lab/streaming-llm/), a simple yet efficient solution to enable "infinite" cont…
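As a rough illustration of the idea, the sketch below shows the cache policy StreamingLLM describes: a few initial "attention sink" tokens are kept forever, a rolling window keeps the most recent tokens, and everything in between is evicted. The class and field names are invented for illustration, and the sketch omits the positional re-indexing the paper also applies.

```
from collections import deque

class SinkCache:
    """Toy KV-cache eviction policy in the style of StreamingLLM."""

    def __init__(self, n_sink=4, window=2044):
        self.n_sink = n_sink                 # tokens kept permanently
        self.sinks = []                      # KV entries for sink tokens
        self.recent = deque(maxlen=window)   # rolling window of KV entries

    def append(self, kv_entry):
        # The first n_sink tokens become permanent sinks; later tokens
        # enter the rolling window, which evicts its oldest entry itself.
        if len(self.sinks) < self.n_sink:
            self.sinks.append(kv_entry)
        else:
            self.recent.append(kv_entry)

    def cache(self):
        # The effective context the model attends over: sinks + window.
        return self.sinks + list(self.recent)
```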
-
I’ve discovered a performance gap between the Neural Speed Matmul operator and the Llama.cpp operator in the Neural-Speed repository. This issue was identified while running a benchmark with the ONNXR…
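The benchmark description above is cut off, so the following is only a generic timing-harness sketch of the kind such an operator comparison needs (warm-up runs, best-of-N timing); the matrix shapes and the NumPy matmul are placeholders for calls into the two operators being compared:

```
import time
import numpy as np

def bench(fn, warmup=5, iters=50):
    """Return the best wall-clock time in seconds over `iters` runs."""
    for _ in range(warmup):      # warm caches before timing
        fn()
    best = float("inf")
    for _ in range(iters):
        t0 = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - t0)
    return best

# Placeholder shapes; swap the lambda for the Neural Speed and
# llama.cpp matmul calls under comparison.
a = np.random.rand(4096, 4096).astype(np.float32)
b = np.random.rand(4096, 4096).astype(np.float32)
print(f"matmul best: {bench(lambda: a @ b) * 1e3:.2f} ms")
```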
-
I installed `llama-cpp-python` on a system with:
**CPU AMD EPYC 7542**
**GPU V100**
But it raised the exception shown in the image below:
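For anyone trying to reproduce this, a minimal load that exercises GPU offload in llama-cpp-python looks roughly like the sketch below; the model path is a placeholder, and `n_gpu_layers=-1` requests full offload, which only works when the package was built with CUDA support:

```
from llama_cpp import Llama

# Minimal reproduction sketch: load a GGUF model with full GPU offload.
# The model path is a placeholder; n_gpu_layers=-1 offloads every layer.
llm = Llama(model_path="/path/to/model.gguf", n_gpu_layers=-1)
out = llm("Hello", max_tokens=16)
print(out["choices"][0]["text"])
```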
-
It does not start with the Llama 3.1 model. Is it possible to make changes so that it works with Llama 3.1? This is currently the model with the largest context window (128K tokens), and it will potentially be used everywhere.
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of…
-
I want to deploy it via ollama, so I first converted it to a .gguf file with llama.cpp's convert_hf_to_gguf.py, but I got a KeyError for "", and I found that it is not in the added_tokens_decoder of tokenizer_c…
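One way to see which token IDs the tokenizer config actually declares before re-running the conversion is to inspect `added_tokens_decoder` directly. This is a minimal sketch assuming a standard Hugging Face layout with a `tokenizer_config.json` in the model directory (the path is a placeholder):

```
import json
from pathlib import Path

# List the special tokens declared in added_tokens_decoder; entries map
# token IDs to dicts that include the token's "content" string.
config = json.loads(Path("/path/to/model/tokenizer_config.json").read_text())
decoder = config.get("added_tokens_decoder", {})
for token_id, entry in sorted(decoder.items(), key=lambda kv: int(kv[0])):
    print(token_id, entry.get("content"))
```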