sgl-project / sglang

SGLang is a fast serving framework for large language models and vision language models.
https://sgl-project.github.io/
Apache License 2.0
6.03k stars, 500 forks

[Feature] Make vLLM optional in model code #1673

Open ByronHsu opened 4 weeks ago

ByronHsu commented 4 weeks ago

Motivation

This is a tracker for removing vLLM dependencies from the general model code (quantization is out of scope). These are our current imports from vLLM, and we want to remove all of them.

from vllm.config import CacheConfig
from vllm.distributed import get_tensor_model_parallel_world_size
from vllm.model_executor.layers.rotary_embedding import get_rope
from vllm.model_executor.layers.vocab_parallel_embedding import (
    ParallelLMHead,
    VocabParallelEmbedding,
)

Tracker

vkc1vk commented 4 days ago

Just curious: are the following imports in model_runner.py also being considered for removal in later stages?

from vllm.config import DeviceConfig, LoadConfig
from vllm.config import ModelConfig as VllmModelConfig
from vllm.distributed import (
    get_tp_group,
    init_distributed_environment,
    initialize_model_parallel,
    set_custom_all_reduce,
)
from vllm.distributed.parallel_state import in_the_same_node_as
from vllm.model_executor.model_loader import get_model
from vllm.model_executor.models import ModelRegistry