-
* In the evaluation steps, `vLLM` is launched locally to serve the candidate models. The code for doing this is not ideal: we (I) stopped at "it works", and it now needs to be revisited and cleaned up (a minimal launch sketch follows this list).
* …
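For context, a minimal sketch of what a cleaned-up local launcher could look like, assuming vLLM's OpenAI-compatible `api_server` entrypoint and its `/health` endpoint; the model path, port, and readiness timeout are illustrative, not the repository's actual code:

```python
# Sketch of a local vLLM launcher for evaluation runs. The port, timeout,
# and model argument are illustrative assumptions.
import subprocess
import time
import urllib.request

def launch_vllm_server(model: str, port: int = 8000) -> subprocess.Popen:
    """Start a local vLLM OpenAI-compatible server and wait until it is ready."""
    proc = subprocess.Popen(
        [
            "python", "-m", "vllm.entrypoints.openai.api_server",
            "--model", model,
            "--port", str(port),
        ]
    )
    # Poll the /health endpoint until the server answers or we give up.
    deadline = time.time() + 300
    url = f"http://localhost:{port}/health"
    while time.time() < deadline:
        try:
            if urllib.request.urlopen(url, timeout=2).status == 200:
                return proc
        except OSError:
            time.sleep(2)
    proc.terminate()
    raise RuntimeError("vLLM server did not become ready in time")
```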
-
### Your current environment
```text
PyTorch version: 2.1.1+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Microsoft Windows 11 Home
GCC vers…
-
As described in the research paper, SageAttention quantizes Q, K, and V to INT8 and performs the GEMM in INT8 (with an FP16 accumulator), so I want to know whether it conflicts with or duplicates…
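For reference, a toy simulation of that scheme: per-tensor symmetric INT8 quantization of Q and K, an integer GEMM, then a rescale. The shapes and scales are made up, and the real SageAttention kernels run on-GPU with FP16 accumulators rather than the INT32 accumulation numpy uses here:

```python
# Toy simulation of an INT8 QK^T: quantize, multiply in integers, rescale.
# Everything here is illustrative; it only mimics the numerics of the scheme.
import numpy as np

def quantize_int8(x: np.ndarray) -> tuple[np.ndarray, float]:
    """Symmetric per-tensor quantization to INT8."""
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
q = rng.standard_normal((4, 64), dtype=np.float32)
k = rng.standard_normal((4, 64), dtype=np.float32)

q_int8, q_scale = quantize_int8(q)
k_int8, k_scale = quantize_int8(k)

# Integer GEMM (numpy needs a wider accumulator dtype), then dequantize.
scores_int = q_int8.astype(np.int32) @ k_int8.astype(np.int32).T
scores = (scores_int * (q_scale * k_scale)).astype(np.float16)

print(np.abs(scores.astype(np.float32) - q @ k.T).max())  # quantization error
```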
-
Hi! This is great work. I have tried deploying a model with vLLM. The vLLM service starts normally, but the following error occurs when the service is invoked.
openai.BadRequestError: Er…
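For context, a typical client call against a vLLM OpenAI-compatible endpoint looks like the sketch below; the base URL, served model name, and prompt are placeholders, not the reporter's actual request:

```python
# Minimal client call against a vLLM OpenAI-compatible server.
# base_url, model, and the message content are placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="my-served-model",  # must match the server's served model name
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
```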
-
## dependency problem
install `vllm==0.3.2+cu118` through `pip install https://github.com/vllm-project/vllm/releases/download/v${VLLM_VERSION}/vllm-${VLLM_VERSION}+cu118-cp${PYTHON_VERSION}-cp${…
-
The vLLM server consistently crashes while processing lm-eval requests:
```
INFO 10-01 09:52:39 engine.py:288] Added request cmpl-270a6c19d13b4fb6aac151b9c8ba44c2-0.
ERROR 10-01 09:52:48 client.py:24…
-
### Your current environment
Python platform: Linux-5.10.213-201.855.amzn2.x86_64-x86_64-with-glibc2.35
Is CUDA available: True
CUDA runtime version: 12.3.107
CUDA_MODULE_LOADING set to: LAZY
GPU…
-
### Your current environment
```text
The output of `python collect_env.py`
WARNING 10-30 12:11:37 _custom_ops.py:19] Failed to import from vllm._C with ModuleNotFoundError("No module named 'vllm.…
-
### Your current environment
(current environment is irrelevant because this is a replacement for the nightly build reference)
### How you are installing vllm
```sh
git clone https://github.com/vllm-project/vllm.git
cd vllm
git checko…
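# (The checkout ref is truncated above; the usual continuation for a source
#  install would be something like the following. This is an assumption,
#  not the reporter's exact steps.)
pip install -e .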
-
### The vLLM docker image is
`intelanalytics/ipex-llm-serving-xpu-vllm-0.5.4-experimental:2.2.0b1`
### vLLM start command is
```sh
model="/llm/models/Qwen2-72B-Instruct/"
served_model_name="Qwen2-72B…
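# (Truncated above; a launch built from these variables would look roughly
#  like the following. This is an assumption based on stock vLLM flags; the
#  ipex-llm image may use a different entrypoint.)
python -m vllm.entrypoints.openai.api_server \
  --model "$model" \
  --served-model-name "$served_model_name"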