-
### What happened?
`GGML_ASSERT: D:\a\llama.cpp\llama.cpp\ggml.c:12853: ne2 == ne02`
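In ggml, `ne2` and `ne02` are tensor extents (`ne[2]`, an outer, batch-like dimension), so the abort means two tensors involved in an op disagree on that dimension. A hedged PyTorch analogy of the same class of failure, not the actual ggml code:

```python
import torch

# Batched matmul with deliberately mismatched batch dimensions: the same
# kind of shape constraint that `ne2 == ne02` guards in ggml.
a = torch.randn(8, 4, 16)  # batch dim = 8
b = torch.randn(2, 16, 4)  # batch dim = 2, mismatched on purpose
try:
    torch.bmm(a, b)
except RuntimeError as err:
    print(err)  # PyTorch reports the mismatch instead of aborting
```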
### Name and Version
```
version: 2965 (03d8900e)
built with MSVC 19.39.33523.0 for x64
```
### What operati…
-
### System Info
- CPU: x86
- Memory: over 300 GB
- GPU: 8 x V100
- No InfiniBand, no NVLink; NCCL uses sockets for communication (see the sketch below)
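For reference, a socket-only setup like this is usually pinned down explicitly so NCCL does not probe for transports that are absent. A minimal sketch, assuming a PyTorch/NCCL job on this box; the `eth0` interface name is an assumption, not a detail from the report:

```python
import os

# Force NCCL onto plain TCP sockets on a host without IB or NVLink.
os.environ["NCCL_IB_DISABLE"] = "1"        # no InfiniBand present
os.environ["NCCL_P2P_DISABLE"] = "1"       # no NVLink peer-to-peer
os.environ["NCCL_SOCKET_IFNAME"] = "eth0"  # NIC for the socket transport (assumed name)
# Set these before torch.distributed.init_process_group(backend="nccl").
```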
Driver:
```
+--------------------------------------------------------------…
```
-
### System Info
tensorrt-llm version: 0.11.0.dev2024062500
Architecture: x86_64
CPU: AMD EPYC 9354 32-Core Processor
```txt
+----------------------------------------------------------…
```
-
### System Info
CPU Architecture: x86_64
CPU/Host memory size: 1024Gi (1.0Ti)
GPU properties:
GPU name: NVIDIA GeForce RTX 4090
GPU mem size: 24 GB…
-
The GPU memory usage continues to increase after each round while finetuning an LLM with an adapter, and the increment after each round is approximately the same. I speculate it's because th…
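The report is cut off, but an equal-sized increment every round matches a well-known PyTorch pattern: keeping a reference to the loss tensor, which retains its autograd history. A minimal sketch under that assumption; the model and loop are hypothetical, not the reporter's code:

```python
import torch

# Storing the loss *tensor* keeps its autograd history alive, so allocated
# memory grows by roughly the same amount every round.
dev = "cuda" if torch.cuda.is_available() else "cpu"
model = torch.nn.Linear(1024, 1024, device=dev)
opt = torch.optim.SGD(model.parameters(), lr=1e-3)
history = []

for rnd in range(3):
    x = torch.randn(64, 1024, device=dev)
    loss = model(x).pow(2).mean()
    loss.backward()
    opt.step()
    opt.zero_grad()
    history.append(loss)  # leak: retains history; use loss.item() instead
    if dev == "cuda":
        print(rnd, torch.cuda.memory_allocated() // 2**20, "MiB allocated")
```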
-
### What is the issue?
When running deepseek-coder-v2:16b on an NVIDIA GeForce RTX 3080 Laptop GPU, I get this crash report:
```
Error: llama runner process has terminated: signal: aborted (core dump…
```
-
### Your current environment
```text
The output of `python collect_env.py`
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
```
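Where `collect_env.py` itself cannot be run, the first few fields above can be approximated by hand. A hedged sketch using only public `torch` attributes:

```python
import torch

# Reproduce the leading collect_env.py fields manually.
print("PyTorch version:", torch.__version__)
print("Is debug build:", torch.version.debug)
print("CUDA used to build PyTorch:", torch.version.cuda)
print("ROCM used to build PyTorch:", torch.version.hip)
```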
-
Affected version: v0.7.1 (main)
I initially assumed the issue was with my system: outdated NVIDIA drivers, CUDA, etc. But after trying on four separate machines running different mixes …
-
### What is the issue?
```
~$ nvidia-smi
Fri May 24 09:41:47 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.171.04 …
```
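For scripted triage, the same GPU state can be captured programmatically. A hedged sketch using standard `nvidia-smi` query options:

```python
import subprocess

# Query name, driver version, and memory usage as machine-readable CSV.
out = subprocess.run(
    ["nvidia-smi",
     "--query-gpu=name,driver_version,memory.used,memory.total",
     "--format=csv,noheader"],
    capture_output=True, text=True, check=True,
)
print(out.stdout)
```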
-
### What happened?
I am using llama.cpp + SYCL to perform inference on a multi-GPU server. However, I get a segmentation fault when using multiple GPUs. The same model can produce inference output…
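A common way to narrow such a fault down is to force the whole model onto a single SYCL device. A hedged sketch using llama.cpp's `--split-mode`/`--main-gpu` options; the binary name and model path are assumptions, not details from the report:

```python
import subprocess

# Run llama.cpp pinned to one SYCL device to rule out the multi-GPU path.
subprocess.run([
    "./llama-cli", "-m", "model.gguf",
    "--split-mode", "none",  # keep all layers on a single device
    "--main-gpu", "0",       # index of the device to use
    "-p", "Hello",
], check=True)
```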