-
### What is the issue?
I'm trying to get the project to compile on Gentoo but am running into some issues, as Gentoo uses different paths.
On Gentoo, ROCm libraries get installed into /usr/lib64, h…
-
### System Info
- CPU: x86_64
- GPU: L40
- tensorrt_llm: 0.11.0
- CUDA: 12.4
- driver: 535.129.03
- OS: CentOS 7
### Who can help?
When I tried to import tensorrt_llm, it got stuck. Through debuggi…
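One stdlib-only way to find where a Python import is stuck is `faulthandler.dump_traceback_later`, which dumps a traceback of every thread after a timeout. This is a minimal sketch that simulates the hang with a deliberate `time.sleep`; in the real case you would replace the sleep with `import tensorrt_llm` to see the exact call that blocks (the module name is from the report above; everything else is illustrative):

```shell
# Ask the interpreter to dump a traceback of all threads to stderr if it
# is still running after 1 second, then stall for 2 seconds on purpose.
# Replace the sleep with the hanging `import tensorrt_llm` to locate the
# blocking call.
python3 -c '
import faulthandler, sys, time
faulthandler.dump_traceback_later(1, file=sys.stderr)
time.sleep(2)  # stand-in for the hanging import
' 2> hang_trace.txt
head -n 1 hang_trace.txt   # e.g. "Timeout (0:00:01)!"
```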
-
I'm trying to run the TensorRT version of the Docker container according to the instructions, but I get a segfault whenever I attempt to transcribe any audio. The same audio works with the Faster Whi…
-
### What happened?
I am trying to run Qwen2-57B-A14B-instruct, and I used llama-gguf-split to merge the gguf files from [Qwen/Qwen2-57B-A14B-Instruct-GGUF](https://huggingface.co/Qwen/Qwen2-57B-A14B-…
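For reference, the merge step described above can be done with the `llama-gguf-split` tool that ships with llama.cpp; a sketch, assuming you pass the first shard (the tool finds the remaining shards itself; the filenames below are illustrative, not the actual shard names from the Qwen repo):

```shell
# Merge a sharded GGUF back into a single file; only the first shard is
# passed, the remaining shards are discovered automatically.
llama-gguf-split --merge model-00001-of-00002.gguf qwen2-57b-a14b-instruct.gguf
```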
-
### What happened?
If creating a llama model in Python code, you can specify n_gpu_layers=-1 so that all layers are offloaded to the GPU (see the example below). When starting the llama.cpp server using the doc…
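For the server path, the corresponding knob is the `--n-gpu-layers` (`-ngl`) flag. A sketch, assuming a llama.cpp build with GPU support (the model path is illustrative); note that on the CLI a large value such as 99 is commonly used to offload every layer, since it exceeds the layer count of typical models:

```shell
# Offload all layers to the GPU when launching the llama.cpp server:
# 99 is larger than the layer count of typical models, so every layer
# ends up on the GPU.
./llama-server -m ./models/model.gguf --n-gpu-layers 99
```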
-
### What happened?
After building the SYCL server image, trying to load a model larger than Q4 on my Arc A770 fails with a memory error.
Anything below Q4 will execute, but this is due to the "llm_l…
-
I compiled vLLM 0.5.4 for a CPU that does not support AVX512. After compiling, I entered the container and ran the command to start the llama3-8b model.
```text
python3 -m…
```
-
Hello, how do I run MLC LLM on WSL2 using the CPU?
I tried `mlc_llm chat HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC`
but got an error.
Please give me a command so that I can copy the error text with my mo…
-
Hi, I am having an issue with running the sample example in the [quickstart guide](https://github.com/intel-analytics/ipex-llm/blob/main/docs/mddocs/Quickstart/llama_cpp_quickstart.md#3-example-runnin…
-
### Your current environment
Hello,
I'm trying to download llama3.1-8B-Instruct to my PC, and each time I try, I get the following error:
```bash
[rank0]: torch.OutOfMemoryError: CUDA out of memo…
```
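Two vLLM knobs that commonly resolve this class of out-of-memory error are `--max-model-len` (which caps the KV cache) and `--gpu-memory-utilization`. A sketch, assuming the OpenAI-compatible server entry point; the model name and values are illustrative:

```shell
# Shorten the context to shrink the KV cache and leave headroom on the GPU.
python3 -m vllm.entrypoints.openai.api_server \
    --model meta-llama/Llama-3.1-8B-Instruct \
    --max-model-len 4096 \
    --gpu-memory-utilization 0.90
```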