-
As a Windows user, I tried to compile this and found that the problem was in these two files, "```flash_fwd_launch_template.h```" and "```flash_bwd_launch_template.h```", under "```./flash-attention/csrc/fl…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…
-
Hi,
The code sample below - which is based on an example in Matthew Honnibal's blog post "Against LLM maximalism" (https://explosion.ai/blog/against-llm-maximalism) - fails to produce any output. This i…
-
I am facing difficulties specifying GPU usage for different models in an LLM inference pipeline using vLLM. Specifically, I have 4 RTX 4090 GPUs available, and I aim to run an LLM with a size of 42GB …
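One common way to pin each model to specific GPUs, sketched here under the assumption that each model runs in its own process, is to restrict `CUDA_VISIBLE_DEVICES` before vLLM initializes. The model path and parallelism degree below are placeholders, and the vLLM call itself is commented out because it needs real GPUs:

```python
import os

# Hypothetical layout for 4 RTX 4090s: give the 42GB model two GPUs
# (tensor parallelism across them) and leave GPUs 2-3 free for other
# models launched from separate processes. This must be set before
# vLLM (and CUDA) initializes, or it has no effect.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

# from vllm import LLM
# llm = LLM(model="path/to/42gb-model", tensor_parallel_size=2)

print(os.environ["CUDA_VISIBLE_DEVICES"])
```

A second process would set `CUDA_VISIBLE_DEVICES="2,3"` for the remaining models, so each vLLM instance only ever sees its own slice of the hardware.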
-
Using bigdl-llm in a production environment, Python performance is too poor. Could you provide an inference library in C++ with an OpenAI-compatible API?
-
### What is the issue?
It was working fine with 2x 7900 XTX, but after I added a new graphics card, the output looks like this:
![imagen](https://github.com/ollama/ollama/assets/118543481/cb8024…
-
### Describe your problem
I select the corresponding knowledge base on the web page, upload multiple PDF files, and start parsing one or more files. Parsing often gets stuck (one or two days do n…
-
Hey there,
I'm running
**LocalAI version:**
`docker run --rm -ti --gpus all -p 8080:8080 -e DEBUG=true -v $PWD/models:/models --name local-ai localai/localai:latest-aio-gpu-nvidia-cuda-12 -…
-
# Prerequisites
Please answer the following questions for yourself before submitting an issue.
- [x] I am running the latest code. Development is very rapid so there are no tagged versions as of…
-
### Feature request
Export to ONNX fails for opset 9 with T5
### Motivation
ONNX opset 9 is required by SNPE, the Qualcomm accelerator SDK. By supporting ONNX opset 9, we will unleash ML on the e…