vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

Hope VLLM can support DeepSeek #1834

Closed. zhaochenyang20 closed this issue 10 months ago

zhaochenyang20 commented 10 months ago

Here is the model:

https://github.com/deepseek-ai/DeepSeek-LLM

esmeetu commented 10 months ago

The DeepSeek model uses a Llama-based architecture, which vLLM already supports well. Have you encountered any issues?

lgw2023 commented 5 months ago

@esmeetu @zhaochenyang20

Both deepseek-coder-33b-instruct and deepseek-coder-6.7b-instruct produce broken output. Launch command and request:

export model_path=/home/deepseek-ai/deepseek-coder-33b-instruct
export tokenizer_path=/home/deepseek-ai/deepseek-coder-33b-instruct
export model_dtype=float
export served_model_name=deepseek-coder-33b-instruct
export model_host=127.0.0.1
export model_port=32006 
export model_parallel=8
export other_parameters=" --max-num-seqs=256 --max-num-batched-tokens=16384 --block-size=32 --gpu-memory-utilization=0.9 --seed=0 --disable-log-requests"
python -m vllm.entrypoints.openai.api_server --tensor-parallel-size=${model_parallel} --served-model-name ${served_model_name} --model ${model_path} --trust-remote-code --tokenizer ${tokenizer_path} --dtype ${model_dtype} --host ${model_host} --port ${model_port} ${other_parameters}
export prompt='I love Beijing, because'
curl -X POST http://127.0.0.1:32006/v1/chat/completions \
 -H "Content-Type: application/json" \
 -d '{
     "model": "deepseek-coder-33b-instruct",
     "messages": [
         {
             "role": "user",
             "content": "'"$prompt"'"
         }
     ],
     "max_tokens": 100,
     "top_k": -1,
     "top_p": 1,
     "temperature": 0,
     "ignore_eos": false,
     "stream": false
 }'

deepseek-coder-33b-instruct returns:

::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::::

deepseek-coder-6.7b-instruct returns:

\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n

ENV info:

vllm==0.1.7 torch==2.0.1 transformers==4.38.2 CUDA 11.4

Is there any solution for this? @WoosukKwon many thanks~
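
For anyone reproducing this, the two failing outputs above share one symptom: the whole completion is a single character repeated. A small helper can flag that automatically when testing several models or dtypes. This is a minimal sketch, not part of vLLM; the function name `is_single_byte_run` is hypothetical, and it assumes the completion text has already been extracted from the JSON reply.

```shell
#!/bin/sh
# Hypothetical helper (not a vLLM tool): succeeds when the text is non-empty
# and consists of exactly one distinct byte repeated, e.g. "::::::::".
is_single_byte_run() {
  text=$1
  # An empty completion is not treated as a repeated-byte run.
  [ -n "$text" ] || return 1
  # Dump the text as hex bytes, one per line, and count the distinct values.
  distinct=$(printf '%s' "$text" | od -An -v -tx1 | tr -s ' ' '\n' | sort -u | grep -c .)
  [ "$distinct" -eq 1 ]
}

if is_single_byte_run "::::::::::::"; then
  echo "degenerate output: one byte repeated"
fi
```

The check counts bytes rather than characters, so it is only meant for ASCII runs like the ones reported here.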

lgw2023 commented 5 months ago

@esmeetu @zhaochenyang20 deepseek-llm-7b-chat and deepseek-llm-67b-chat work well:

export model_path=/home/deepseek-ai/deepseek-llm-7b-chat
export tokenizer_path=/home/deepseek-ai/deepseek-llm-7b-chat
export model_dtype=float
export served_model_name=deepseek-llm-7b-chat
export model_host=127.0.0.1
export model_port=32006 
export model_parallel=8
export other_parameters=" --max-num-seqs=256 --max-num-batched-tokens=4096 --block-size=32 --gpu-memory-utilization=0.9 --seed=0 --disable-log-requests"
python -m vllm.entrypoints.openai.api_server --tensor-parallel-size=${model_parallel} --served-model-name ${served_model_name} --model ${model_path} --trust-remote-code --tokenizer ${tokenizer_path} --dtype ${model_dtype} --host ${model_host} --port ${model_port} ${other_parameters}
export prompt='I love Beijing, because'
export served_model_name=deepseek-llm-7b-chat
curl -X POST http://127.0.0.1:32006/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
    "model": "deepseek-llm-7b-chat",
    "messages": [
        {
            "role": "user",
            "content": "'"$prompt"'"
        }
    ],
    "max_tokens": 100,
    "top_k": -1,
    "top_p": 1,
    "temperature": 0,
    "ignore_eos": false,
    "stream": false
}'

deepseek-llm-7b-chat returns:

Beijing is a city with rich history, vibrant culture, and modern charm. Here are some reasons why someone might love Beijing:
1. Historical landmarks: Beijing is home to numerous iconic historical sites, such as the Great Wall of China, the Forbidden City, and the Temple of Heaven. These landmarks offer a glimpse into China's rich history and culture.
2. Cultural experiences: The city is a melting pot of diverse cultures, with traditional Chinese customs and practices coexisting alongside modern influences. Visitors can experience traditional Chinese music, dance, and cuisine while exploring the city's many museums, galleries, and theaters.
3. Shopping and markets: Beijing is a shopper's paradise, with numerous markets like the Silk Street, Wangfujing, and the Hepingmen Night Market. Here, visitors can find everything from traditional Chinese handicrafts to trendy fashion items.
4. Modern infrastructure: Despite its ancient history, Beijing boasts modern infrastructure, including efficient public transportation systems, modern shopping malls, and high-tech amenities.
5. Delicious cuisine: Beijing is famous for its mouthwatering local dishes, such as Peking roast duck, dumplings, and hot pot. Foodies will find endless culinary delights to explore in the city.
6. Green spaces: Despite its urban setting, Beijing has numerous green spaces, such as the Beijing Botanical Garden, the Olympic Park, and the Summer Palace. These parks offer a peaceful retreat from the bustling city life.
7. International connections: As the capital of China, Beijing is a hub for international trade and diplomacy. The city hosts numerous international conferences, exhibitions, and cultural events, making it a vibrant and dynamic place to be.
8. Sports and entertainment: Beijing is home to the National Stadium, also known as the Bird's Nest, which hosted the 2008 Olympic Games. The city also offers a variety of entertainment options, including theaters, cinemas, and live performances.
These are just a few reasons why someone might love Beijing. Its rich history, vibrant culture, and modern amenities make it a captivating destination for travelers and locals alike.

deepseek-llm-67b-chat with model_dtype=half returns:

I'm glad to hear that you love Beijing! As an AI language model, I don't have personal experiences or emotions, but I can provide you with some reasons why people might love Beijing:\n1. Rich history and culture: Beijing is the capital of China and has a long history dating back over 3,000 years. It is home to numerous historical sites, such as the Forbidden City, the Temple of Heaven, and the Summer Palace, which showcase China's

ENV info:

vllm==0.1.7 torch==2.0.1 transformers==4.38.2 CUDA 11.4
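
The requests in this thread splice `$prompt` into the JSON body with shell quoting, which produces invalid JSON as soon as the prompt contains a double quote or a backslash. The sketch below builds the same request body from a model name and a prompt so that the working and failing models can be queried with an identical payload. The function names `json_escape` and `build_payload` are hypothetical, and the escaping assumes a single-line prompt without control characters.

```shell
#!/bin/sh
# Escape backslashes and double quotes so the value is a valid JSON string.
# Assumption: the input is a single line without control characters.
json_escape() {
  printf '%s' "$1" | sed 's/\\/\\\\/g; s/"/\\"/g'
}

# Print the request body used in this thread for a given model and prompt.
build_payload() {
  model=$(json_escape "$1")
  prompt=$(json_escape "$2")
  printf '{"model": "%s", "messages": [{"role": "user", "content": "%s"}], "max_tokens": 100, "top_k": -1, "top_p": 1, "temperature": 0, "ignore_eos": false, "stream": false}\n' "$model" "$prompt"
}

# Show the payload for one of the models from this thread.
build_payload "deepseek-coder-33b-instruct" 'I love Beijing, because'

# To send it to the server started above (host and port as in this thread):
# curl -X POST http://127.0.0.1:32006/v1/chat/completions \
#   -H "Content-Type: application/json" \
#   -d "$(build_payload deepseek-coder-33b-instruct 'I love Beijing, because')"
```

Sending the same payload to each served model keeps the model as the only difference on the request side. The launch commands in this thread still differ in `--max-num-batched-tokens`.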