-
### Checked other resources
- [X] I added a very descriptive title to this issue.
- [X] I searched the [LangGraph](https://langchain-ai.github.io/langgraph/)/LangChain documentation with the integrat…
-
### System Info
- nvidia: 535.129.03
- cuda_version: 12.4
- GPU: L40S
- OS: Ubuntu 22.04.4 LTS (docker)
- tensorrt-llm: 0.11.0.dev2024060400
### Who can help?
_No response_
### Information
…
-
I have run into an error with the vLLM framework when trying to run inference on an Unsloth fine-tuned Llama3-8B model...
### Error:
(venv) ubuntu@ip-192-168-68-10:~/ans/vllm-server$ python -O -u -m vl…
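For reference, one workaround that has helped in similar reports is to merge the LoRA adapter into the base weights before serving. The sketch below is only an assumption about the setup: it presumes the fine-tune was saved as a LoRA adapter, uses placeholder paths, and goes through PEFT's `merge_and_unload()` rather than Unsloth's own export helpers.

```python
# A minimal sketch, assuming the fine-tune was saved as a LoRA adapter.
# BASE / ADAPTER / MERGED are placeholder paths, not from the report above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
from vllm import LLM, SamplingParams

BASE = "meta-llama/Meta-Llama-3-8B-Instruct"   # base model the adapter was trained on
ADAPTER = "./llama3-8b-unsloth-lora"           # directory containing the LoRA adapter
MERGED = "./llama3-8b-merged"                  # output directory for the merged weights

# Merge the adapter into the base weights so vLLM can load a plain checkpoint.
base = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(base, ADAPTER).merge_and_unload()
merged.save_pretrained(MERGED)
AutoTokenizer.from_pretrained(BASE).save_pretrained(MERGED)

# Serve the merged checkpoint with vLLM.
llm = LLM(model=MERGED)
print(llm.generate(["Hello"], SamplingParams(max_tokens=32))[0].outputs[0].text)
```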
-
I'm getting the following import error:
```
sgl ➜ export CUDA_VISIBLE_DEVICES=4; python -m sglang.launch_server --model-path meta-llama/Llama-2-7b-chat-hf --port 30000
Traceback (most recent call…
-
### Description
Make the vLLM example work with the latest vLLM version (v0.4.3),
following the current example from https://docs.ray.io/en/master/serve/tutorials/vllm-example.html.
I got this exception:
``…
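In case it helps others hitting the same thing, here is a minimal sketch (not the official tutorial code) of a Ray Serve deployment wrapping vLLM's `AsyncLLMEngine`; the model name and request payload are placeholders, and the exact constructor arguments may differ across vLLM releases.

```python
# A minimal sketch (not the official tutorial): a Ray Serve deployment that
# wraps vLLM's AsyncLLMEngine. Model name and request payload are placeholders.
from ray import serve
from vllm.engine.arg_utils import AsyncEngineArgs
from vllm.engine.async_llm_engine import AsyncLLMEngine
from vllm.sampling_params import SamplingParams
from vllm.utils import random_uuid


@serve.deployment
class VLLMDeployment:
    def __init__(self, model: str = "meta-llama/Meta-Llama-3-8B-Instruct"):
        self.engine = AsyncLLMEngine.from_engine_args(AsyncEngineArgs(model=model))

    async def __call__(self, request):
        payload = await request.json()
        params = SamplingParams(max_tokens=payload.get("max_tokens", 64))
        # generate() yields an async stream of RequestOutput; keep the last one.
        final = None
        async for output in self.engine.generate(payload["prompt"], params, random_uuid()):
            final = output
        return {"text": final.outputs[0].text}


app = VLLMDeployment.bind()
# serve.run(app)  # then POST {"prompt": "..."} to http://127.0.0.1:8000/
```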
-
I have seen that AutoFP8-quantized models from Hugging Face, especially Mixtral-8x7B-FP8, are supported by vLLM. I am wondering if models with both the kv_cache and the weights quantized by AutoFP8 are …
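For what it's worth, the way I would expect this to be wired up (a hedged sketch, not a confirmed configuration) is to load the FP8 checkpoint through the FP8 weight-quantization path and also request an FP8 KV cache; the checkpoint name below is illustrative, and whether AutoFP8's kv_cache scales are actually picked up depends on the vLLM version.

```python
# A hedged sketch: loading an FP8 checkpoint with an FP8 KV cache in vLLM.
# The checkpoint name is illustrative; support for AutoFP8 kv_cache scales
# depends on the installed vLLM version.
from vllm import LLM, SamplingParams

llm = LLM(
    model="neuralmagic/Mixtral-8x7B-Instruct-v0.1-FP8",  # assumed AutoFP8 checkpoint
    quantization="fp8",       # FP8 weight quantization path
    kv_cache_dtype="fp8",     # store the KV cache in FP8 as well
)
print(llm.generate(["Hello"], SamplingParams(max_tokens=16))[0].outputs[0].text)
```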
-
Using vLLM to run inference on the DeepSeek model, I encountered an error
```
[rank0]: self.mlp = DeepseekV2MoE(config=config, quant_config=quant_config)
[rank0]: File "/home/root/.local/lib/python3.10/s…
-
ModuleNotFoundError: No module named 'vllm.engine.ray_utils'
Please tell me which vLLM version is required, thanks.
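Since the module layout has changed between vLLM releases, the quickest way to narrow this down is to check which version is actually installed, e.g.:

```python
# Print the installed vLLM version; a missing vllm.engine.ray_utils usually
# means the calling code targets a different release than the one installed.
import vllm
print(vllm.__version__)
```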
-
### Describe the bug
python run_vllm.py
2024-07-05 15:25:04,647 WARNING utils.py:580 -- Detecting docker specified CPUs. In previous versions of Ray, CPU detection in containers was incorrect. Plea…
-
## Description
I would like to inquire if there are any plans to support more configuration settings for vLLM, specifically related to RoPE scaling and theta adjustments.
## Background
vLLM curre…
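As a point of reference, the sketch below shows how these overrides could be passed straight to the engine, assuming the `rope_scaling` and `rope_theta` arguments exposed by recent vLLM releases; all values are illustrative only, and older versions may not accept them.

```python
# A hedged sketch, assuming the rope_scaling / rope_theta engine arguments
# exposed by recent vLLM releases; all values here are illustrative only.
from vllm import LLM

llm = LLM(
    model="meta-llama/Meta-Llama-3-8B-Instruct",
    rope_scaling={"type": "linear", "factor": 2.0},  # illustrative scaling config
    rope_theta=1000000.0,                            # illustrative base-frequency override
    max_model_len=16384,
)
```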