-
Hey all, thanks for your work. It seems there's an official GGUF release for the 1.0 version of the Llama-based model, but not for the 1.1 version. Is that because llama.cpp changes would be required?…
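If the 1.1 checkpoint keeps the same architecture as 1.0, conversion may not require any llama.cpp changes at all; a sketch assuming a local llama.cpp checkout (the model path and output names here are illustrative):

```sh
# convert_hf_to_gguf.py ships with llama.cpp; the input path is a placeholder.
python convert_hf_to_gguf.py /path/to/model-1.1 --outfile model-1.1-f16.gguf

# Optionally quantize with the bundled llama-quantize tool:
./llama-quantize model-1.1-f16.gguf model-1.1-Q4_K_M.gguf Q4_K_M
```

If conversion fails with an unknown-architecture error, that would suggest llama.cpp changes really are needed for 1.1.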
-
Hello,
I have fine-tuned a Llama 3 model and now I would love to use it on a CPU. I tried to use `device_map = 'cpu'` when loading the model.
However, I am still encountering CUDA issues such as
…
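When `device_map='cpu'` alone is not enough (for example, a quantization config or another CUDA-aware library still probes the GPU), a common workaround, sketched here under that assumption, is to hide the GPUs from the process before importing torch or transformers:

```python
import os

# Hide all GPUs from CUDA-aware libraries. This must run BEFORE importing
# torch or transformers, because CUDA device visibility is read at import time.
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# Hypothetical model path for illustration; load as usual afterwards:
# from transformers import AutoModelForCausalLM
# model = AutoModelForCausalLM.from_pretrained(
#     "path/to/finetuned-llama3", device_map="cpu"
# )
```

With no visible CUDA devices, any code path that would otherwise try to initialize CUDA falls back to CPU.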
-
When reading the [API Docs](https://github.com/ollama/ollama/blob/main/docs/api.md#request-7) many options are listed with no visible explanation for what they do. The only explanation I could find [h…
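For context, most of those options map directly onto llama.cpp sampling and runtime parameters. A sketch of an `/api/generate` request body setting a few of the commonly used ones (the values are illustrative, not recommendations):

```json
{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "options": {
    "num_ctx": 4096,
    "num_predict": 128,
    "temperature": 0.7,
    "top_p": 0.9,
    "repeat_penalty": 1.1,
    "seed": 42
  }
}
```

Here `num_ctx` is the context window in tokens, `num_predict` caps how many tokens are generated, and `repeat_penalty` penalizes recently generated tokens to reduce repetition.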
-
Tried following the wiki
https://github.com/likelovewant/ROCmLibs-for-gfx1103-AMD780M-APU/wiki/Unlock-LM-Studio-on-Any-AMD-GPU-with-ROCm-Guide#using-amd-graphics-cards-with-lm-studio
Copied the fi…
-
When running the x86 model_service image, you hit this error:
```
Traceback (most recent call last):
File "/opt/app-root/lib64/python3.9/site-packages/llama_cpp/llama_cpp.py", line 74, in _load…
-
## Overview
## Tasklist
- [ ] Can this be solved via llama.cpp? (e.g. compiled for Vulkan and ROCm)
- [x] https://github.com/janhq/cortex.llamacpp/issues/9
- [ ] [https://github.com/janhq/jan/issues…
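On the first tasklist item: llama.cpp can indeed be compiled with Vulkan or ROCm backends. A build sketch, assuming a recent llama.cpp tree and an installed Vulkan SDK (flag names have changed across versions; older trees used `LLAMA_VULKAN` / `LLAMA_HIPBLAS` instead):

```sh
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Vulkan backend:
cmake -B build -DGGML_VULKAN=ON
cmake --build build --config Release

# For ROCm/HIP, use -DGGML_HIP=ON instead of -DGGML_VULKAN=ON.
```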
-
### System Info / 系統信息
Ubuntu 20.04
### Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?
- [X] docker / docker
- [ ] pip install / 通过 pip install 安装
- [ ] installation from source / 从源码…
-
**Issue identified:** cuDNN SDPA JIT recompiles whenever the context length changes. As a result, training runs that do not use packing keep recompiling, which causes the observed ~500 ms overhead.
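Packing sidesteps the recompilation because every batch then has the same fixed sequence length, so the kernel is compiled once. A minimal greedy-packing sketch (pure Python; `max_len` and `pad_id` are illustrative parameters):

```python
def pack_sequences(seqs, max_len, pad_id=0):
    """Greedily pack token sequences into fixed-length rows.

    Every returned row has exactly max_len tokens, so the attention
    kernel always sees the same shape and is JIT-compiled once instead
    of recompiling for each new context length.
    """
    rows, current = [], []
    for seq in seqs:
        seq = seq[:max_len]  # truncate over-long sequences
        if len(current) + len(seq) > max_len:
            # Flush the current row, padded up to max_len.
            rows.append(current + [pad_id] * (max_len - len(current)))
            current = []
        current = current + seq
    if current:
        rows.append(current + [pad_id] * (max_len - len(current)))
    return rows

# Example: three sequences packed into rows of length 4
pack_sequences([[1, 2], [3, 4, 5], [6]], max_len=4)
# → [[1, 2, 0, 0], [3, 4, 5, 6]]
```

A real training setup would additionally track per-row sequence boundaries so attention does not cross packed examples; the point here is only that the output shape is constant.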
--…
-
I found that the benchmark suite reports time to first token. However, when I run `python benchmark.py --model meta-llama/Llama-2-7b-hf static --isl 128 --osl 128 --batch 1`, an error occurs:…
-
Apple released several open-source LLMs designed to run on-device.
[Huggingface Link](https://huggingface.co/apple/OpenELM)