-
### System Info
- Host: VMware ESXi 7
- Host Nvidia drivers: 550.54.16
- VM CPU architecture: x86_64
- VM Nvidia drivers: 550.54.15
- VM OS: Ubuntu 22.04 LTS
- Physical GPU: A100
- TensorRT-LLM…
-
### What happened?
Compilation command:
`cmake .. -DGGML_VULKAN=ON -DCMAKE_BUILD_TYPE=Release`
Run command:
`./bin/llama-cli -m ggml-model-q4_k.gguf -c 512 -b 1024 -n 256 --keep 48 -…
-
Installation steps on the HOST:
conda create -n llm python=3.11
conda activate llm
# the command below installs intel_extension_for_pytorch==2.1.10+xpu by default
pip install --pre --upgrade ipex-llm[xpu] --extra-index…
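A quick sanity check for the steps above (a minimal sketch, not part of the original report): from inside the `llm` conda env, you can verify whether the extension that `ipex-llm[xpu]` pulls in is actually visible to Python's import system, without triggering its import-time side effects.

```python
# Hedged sketch: check whether a package is importable without importing it.
# importlib.util.find_spec returns None when the module cannot be found.
import importlib.util


def backend_available(mod_name: str) -> bool:
    """Return True if the named top-level module can be found by the import system."""
    return importlib.util.find_spec(mod_name) is not None


# In the "llm" conda env created above, this should report True
# once the pip install has completed:
print(backend_available("intel_extension_for_pytorch"))
```

If this prints `False`, the install step did not complete in the active environment, which is worth ruling out before debugging anything GPU-side.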
-
While fine-tuning Llama 3 using llama.cpp on my Mac, I encountered this error. I'm a beginner and don't know what caused it; I hope an expert can help me.
The model used is: …
-
I just upgraded to the latest Ollama to verify the issue, and it is still present on my hardware.
I am running version 0.1.25 and trying to run the Falcon model.
Warning: could not connect to a ru…
-
Running vLLM according to the instructions. Docker segfaults at startup, so I'm running directly on the machine.
I start the server with the following shell script. As you can see, I've tried to turn max…
-
### What is the issue?
Hi,
I have a small problem: I try to run the model I downloaded, but it does not start.
I have tried several approaches:
ollama run qwen2:72b-instruct --verbose
also I try with:…
-
I'm getting this error when using --quantkv with Metal.
```
GGML_ASSERT: ggml-metal.m:924: !"unsupported op"
```
>python3.11 koboldcpp.py Mistral-7B-Instruct-v0.3-Q8_0.gguf --nommap --flashattenti…
-
Hello TensorRT-LLM experts!
I have a question about some strange behavior of the XQA kernel supported in NVIDIA's official MLPerf 4.0 version of TensorRT-LLM.
First of all, I want to te…
-
### What happened?
I'm in the process of experimenting with RPC using fresh builds from ~today, and I'm seeing some things that appear at first sight to be bugs and also perhaps just lacking suppo…