-
### Your current environment
PyTorch version: 2.4.0+cu121
Is debug build: False
CUDA used to build PyTorch: 12.1
ROCM used to build PyTorch: N/A
OS: Ubuntu 22.04.2 LTS (x86_64)
GCC version: (U…
-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the [LangGraph](https://langchain-ai.github.io/langgraph/)/LangChain documentation with the integrat…
-
### The model to consider.
https://huggingface.co/Qwen/Qwen2-VL-7B-Instruct
### The closest model vllm already supports.
https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/models/qw…
-
### Description & Motivation
[vLLM](https://github.com/vllm-project/vllm) is one of the most popular and effective tools for quick, large-scale LLM inference. Are there any existing examples of incorp…
-
```
llm_cfg = {
    # Use the model service provided by DashScope:
    'model': 'qwen-vl-max-0809',
    #'api_key': 'YOUR_DASHSCOPE_API_KEY',
    # It will use the `DASHSCOPE_API_KEY` environment…
-
### Your current environment
The output of `python collect_env.py`
```text
Collecting environment information...
WARNING 09-05 21:11:49 cuda.py:22] You are using a deprecated `pynvml` package.…
-
**Is your feature request related to a problem? Please describe.**
I'd like to run vLLM behind an OpenAI-compatible API so that it can serve as a drop-in replacement for ChatGPT.
**Describe…
-
PyTorch is dead. Long live JAX.
https://neel04.github.io/my-website/blog/pytorch_rant/
LLM Compressor
https://github.com/vllm-project/llm-compressor
https://neuralmagic.com/blog/llm-compressor-i…
-
Hi Eric, great Rust program.
I am looking for a crate so I can use a chatbot function within my Rust program. I tried to do that with candle. I hope it will be better documented in the future.
…
-
### Describe the issue
```python
from vllm import LLM, SamplingParams
from minference import MInference
prompts = [
"Hello, my name is",
"The president of the United States is",
…