-
# Prerequisites
ROCm 6
# Expected Behavior
Attempting to utilize llama_cpp_python in the OobaBooga WebUI
# Current Behavior
It loads the model into VRAM. Then, upon trying to infer:
gml…
-
### What is the issue?
After the model is cleared from the graphics card's RAM and run again, it is not reloaded into the graphics card's RAM but runs on the CPU instead, which slows it down a…
-
### What is the issue?
I have pulled a couple of LLMs via Ollama. When I run any of them, the response is very slow – so slow that I can type faster than the model responds.
My system speci…
-
### Is there an existing issue for this?
- [ ] I have searched the existing issues
### Current behavior
Error log below.
By the way, the same model and the same mmproj file work with koboldcpp, so maybe you can copy-paste ;)
### Minimum repro…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
WARNING 08-27 02:59:41 cuda.py:22] You are using a deprecated `pynvml` packag…
-
## 🐛 Bug
After converting Mistral-Large-2407 and trying to load the model for chatting or serving, the following error appears:
"(mlc-llm) USER@MBPM3MVLB ~ % mlc_llm serve /Users/USER/LLM/M…
-
### Your current environment
```text
(vllm) nd600@PC-7C610BFD7B:~$ python collect_env.py
Collecting environment information...
/home/nd600/miniconda3/envs/vllm/lib/python3.10/site-packages/torch…
-
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
Fine-tuning fails with: ValueError: None is not in list
$ sh train.sh
08/16/2023 16:29:49 - WARNING - __main__…
-
### What happened?
I tried to run llama.cpp on a Samsung Galaxy Tab S9 Ultra running Android 13. I compiled the libraries according to the guide, used them in my APK, and…
-
### Your current environment
```text
Collecting environment information...
PyTorch version: 2.3.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
…