-
### What happened?
`GGML_CUDA_ENABLE_UNIFIED_MEMORY` is documented as automatically swapping out VRAM under pressure, letting you run any model as long as it fits within available RAM…
-
### What happened?
llama.cpp produces garbled output when running QWen2.5-7b-f16.gg on the 310P3
### Name and Version
./build/bin/llama-cli -m Qwen2.5-7b-f16.gguf -p "who are you" -ngl 32 -fa
### What operating system are you seeing the …
-
Hi
After the recent update, my deepImageJ bundled model, which ships as part of a software package, no longer works.
You can download the model here:
`https://zenodo.org/records/10460434` and get th…
-
### Description
model https://huggingface.co/Qwen/Qwen2.5-Coder-7B-Instruct-GGUF/blob/main/qwen2.5-coder-7b-instruct-q5_k_m.gguf
Generate using GPU source code
Run the source code example LLama Ker…
-
Hi,
Thank you for this work.
I would like to enquire about MIA-Efficiency computation.
The following snippet
```
svc_mia_forget_efficacy = SVC_MIA(
shadow_train=shadow_train_loader…
-
## Issue Description
Hello everyone, I hope you're doing well. I'm trying to implement Differential Privacy (DP) with Opacus in a federated learning training structure where, in each round, each cl…
-
### Jan version
0.5.7
### Describe the Bug
Using Jan v0.5.7 on a Mac with an M1 processor, running Llama 3.2 3B instruct q8 via the API. Occasionally, the server stops responding to POST requ…
-
Version 2.0 does not start, displaying the error "Unhandled exception in script.". Details:
Failed to execute script 'main' due to unhandled exception: 'cp932' codec can't encode character '\u2705' …
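
For reference, this class of error can be reproduced in isolation. The following is a minimal sketch, assuming the script emits a check-mark character (U+2705) on a system whose default encoding is cp932; the `safe_encode` helper is hypothetical and not part of the application, and the actual failing call in `main` is not shown in the report.

```python
# Sketch only: shows that cp932 has no mapping for U+2705 and one
# possible fallback. The text and helper name are illustrative assumptions.
text = "\u2705 done"


def safe_encode(s: str, encoding: str = "cp932") -> bytes:
    """Encode s, substituting '?' for characters the codec cannot represent."""
    return s.encode(encoding, errors="replace")


try:
    text.encode("cp932")  # strict mode is the default
    strict_ok = True
except UnicodeEncodeError:
    strict_ok = False

fallback = safe_encode(text)
print(strict_ok, fallback)
```

Replacing unencodable characters loses the symbol but keeps the program running; whether that trade-off is acceptable here depends on where the character is written, which the report does not say.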
-
With [HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1](https://huggingface.co/HIT-TMG/KaLM-embedding-multilingual-mini-instruct-v1), I have the following error:
```
Loader not specified for m…
-
### Your current environment
vllm version = 0.6.1
### Model Input Dumps
_No response_
### 🐛 Describe the bug
The output of `command:`
vllm version = 0.6.1. InternVLChat is in lis…