-
Hello,
While using the ELI5 and TriviaQA datasets from the Hugging Face library, I encountered errors caused by documents that are missing from the corpus. I experienced a similar issue …
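For reference, a minimal sketch of loading one of these datasets through the Hugging Face `datasets` library; the config and split names are my assumptions, and the retrieval corpus the report refers to is separate from the QA pairs themselves:

```python
from datasets import load_dataset

# Load TriviaQA in its no-context reading-comprehension config
# (config/split names are assumptions, not taken from the report).
trivia = load_dataset("trivia_qa", "rc.nocontext", split="validation")
print(trivia[0]["question"])
```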
-
Hi, using vLLM 0.10.3 with the llama3 tokenizer, I can't seem to constrain generation to emojis.
```
curl --request POST \
  --url http://localhost:8000/v1/chat/completions \
  --hea…
```
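For context, a minimal reproduction sketch in Python against a local vLLM OpenAI-compatible server, using vLLM's `guided_choice` guided-decoding request field; the model name and the emoji choices are placeholders, not taken from the report:

```python
import requests

# Ask the server to constrain the reply to one of a few emoji strings.
# Model name and choices below are illustrative placeholders.
resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "meta-llama/Meta-Llama-3-8B-Instruct",
        "messages": [{"role": "user", "content": "Reply with one emoji."}],
        "guided_choice": ["😀", "😢", "👍"],
    },
)
print(resp.json()["choices"][0]["message"]["content"])
```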
-
Noob here: does this mean no Mac support? "AssertionError: vLLM only supports Linux platform (including WSL)."
-
Hi, when I follow the default steps to set up the environment:
`pip install vllm`
it automatically installs vllm 0.5.0.post1, which requires transformers>=4.40.0.
When installing SPPO ( transformer…
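A quick way to surface the conflict described above (assuming SPPO pins an older transformers, which is what the truncated line suggests) is to print both installed versions:

```python
from importlib.metadata import version

# vllm 0.5.0.post1 requires transformers>=4.40.0; a package that pins an
# older transformers cannot coexist with it in the same environment.
for pkg in ("vllm", "transformers"):
    print(pkg, version(pkg))
```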
-
I wonder, can a quanto-quantized model be used with vLLM?
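For comparison, a minimal sketch of how a quantization backend is normally selected in vLLM's offline API; the model name is a placeholder, and as far as I know quanto is not among the supported backends, so a quanto checkpoint would likely need re-quantizing with a supported method first:

```python
from vllm import LLM, SamplingParams

# vLLM selects its quantization kernels via the `quantization` argument;
# "awq" and "gptq" are supported values (quanto, to my knowledge, is not).
llm = LLM(model="TheBloke/Llama-2-7B-AWQ", quantization="awq")
out = llm.generate(["Hello"], SamplingParams(max_tokens=16))
print(out[0].outputs[0].text)
```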
-
I'm trying to run/load prometheus on Amazon SageMaker Studio notebooks but keep running into errors.
If I load it using VLLM
`model = VLLM(model="prometheus-eval/prometheus-7b-v2.0")`
`ValueErro…
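Since the ValueError is truncated, here is only a hedged sketch of how the LangChain `VLLM` wrapper is typically instantiated, with the memory-related knobs that often matter on notebook-class GPUs; all parameter values are assumptions:

```python
from langchain_community.llms import VLLM

# gpu_memory_utilization and max_model_len are forwarded to vllm.LLM via
# vllm_kwargs; the values here are illustrative, not taken from the report.
llm = VLLM(
    model="prometheus-eval/prometheus-7b-v2.0",
    trust_remote_code=True,
    max_new_tokens=256,
    vllm_kwargs={"gpu_memory_utilization": 0.9, "max_model_len": 4096},
)
print(llm.invoke("Rate the following response: ..."))
```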
-
### Feature Description
```
from llama_index.core.llms.vllm import VllmServer
from llama_index.core.llms import ChatMessage
llm = VllmServer(api_url="http://localhost:8000", max_new_tokens=8000, temp…
```
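As a point of reference, a hypothetical completion of the snippet above (the import paths and api_url are copied from the report; whether chat-style calls work against VllmServer is exactly what the request is about):

```python
from llama_index.core.llms.vllm import VllmServer  # import path as given in the report
from llama_index.core.llms import ChatMessage

llm = VllmServer(api_url="http://localhost:8000", max_new_tokens=256)
print(llm.complete("Hello!"))                                  # plain completion
print(llm.chat([ChatMessage(role="user", content="Hello!")]))  # chat-style call
```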
-
Hello, nice work and very helpful! Does this support vLLM for fast generation?
-
Hello, in recent tests on an A100 I benchmarked models such as Llama-13b and 7b, comparing vLLM and DistServe; with the SLO satisfied, DistServe outperforms vLLM. But when testing codellama-34b with an input length of 8192, I found TTFT to be roughly 3x higher than vLLM's. Is this expected? vLLM uses tp2; DistServe uses prefill tp2 and decode tp2.
-
While working on the addition of vLLM in https://github.com/instructlab/instructlab/pull/1442, I tried adding a functional test to the e2e tests since the runner has a CUDA GPU. Unfortunately, it does not have en…