-
Copy-pasting code results from poor design and architecture, leading to technical debt and unmaintainable code.
Lines near the start:
src/instructlab/data/generate.py:186: backend = b…
-
Thanks for publishing this customized version of vllm.
I tried to install it by following the README.md and ran into some problems.
The error message is as follows:
```
Building wheels for collecte…
```
-
### Issue Kind
Change in current behaviour
#### Description:
I have encountered an issue with Poetry's handling of platform-specific dependencies, particularly when using environment markers …
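
For context, this is the kind of platform-specific dependency the truncated report seems to describe, written as a minimal `pyproject.toml` sketch (the package name and marker are illustrative assumptions, not taken from the issue):

```toml
[tool.poetry.dependencies]
python = "^3.10"
# Illustrative only: restrict a dependency to Linux via a PEP 508 environment marker
nvidia-ml-py = { version = "^12.0", markers = "sys_platform == 'linux'" }
```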
-
### Question Validation
- [X] I have searched both the documentation and discord for an answer.
### Question
I couldn't find how to delete the model.
With the default vLLM:
```python
import gc…
```
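
For reference, a minimal sketch of the usual community recipe for releasing a model: drop every reference to the engine object, force a garbage collection, then empty the CUDA cache. The `LLM` class is replaced by a stand-in here so the pattern stays runnable without a GPU; the real vLLM/torch calls are shown in comments:

```python
import gc

class EngineStandIn:
    """Stand-in for vllm.LLM so the pattern runs without a GPU (assumption)."""
    pass

llm = EngineStandIn()        # in practice: llm = LLM(model="...")
del llm                      # 1. drop every reference to the engine
unreachable = gc.collect()   # 2. force a full garbage collection pass
# in practice, also release cached GPU memory:
# import torch
# torch.cuda.empty_cache()
```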
-
What is the progress on vLLM support?
-
I had an existing config and switched my serve backend setting to `vllm`. This was using a quantized version of Mixtral in gguf format.
The error I got didn't make it obvious at all that a gguf was…
-
Based on practical tests, deploying omost-llama-3-8b on an A100 using torch==2.3.0+cu118, vllm==0.5.0.post1+cu118, and xformers==0.0.26.post1+cu118 works well. If you want to speed up the process, you can ref…
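
A quick way to confirm a local environment matches that version combination (a hedged sketch; the package names are the standard PyPI ones, and the helper function is hypothetical, not from the post):

```python
# Verify installed versions against the combination reported to work on an
# A100 (cu118 builds). Uses only the standard library.
from importlib.metadata import version, PackageNotFoundError

EXPECTED = {
    "torch": "2.3.0+cu118",
    "vllm": "0.5.0.post1+cu118",
    "xformers": "0.0.26.post1+cu118",
}

def check_versions(expected):
    """Map package -> (installed_version_or_None, matches_expected)."""
    report = {}
    for pkg, want in expected.items():
        try:
            have = version(pkg)
        except PackageNotFoundError:
            have = None  # package is not installed at all
        report[pkg] = (have, have == want)
    return report
```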
-
Is there comparative performance data between ScaleLLM and vLLM?
-
Machine: A800, vLLM 0.5.0, the prompt is "开始" ("begin"), output max tokens = 2048, temperature set to 0.7.
vLLM loads Qwen2-72B-Instruct-gptq-int4; running concurrency tests with vLLM's benchmark script, the output repeats itself regardless of whether the concurrency limit is 1 or 10.
https://github.com/vllm-project/vllm/blob/main/…
-
# [FML] Good news: I made it into the big shots' repo meetup...
Good news: I made it into the vLLM meetup… Bad news: it was a public execution. In April, vLLM held a meetup where they shared their project progress, roadmap, and benchmarks, including comparisons with similar projects; LMDeploy was chosen as one of the comparison targets. Thanks to the efforts of my brilliant colleagues, LMDeploy
[https://grimoire.g…