deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Why is max_model_len only 8192 when running inference with vLLM for DeepSeek-V2-Chat? #72

Open · ybdesire opened this issue 2 months ago

ybdesire commented 2 months ago

Also, what is the maximum supported value of max_model_len for DeepSeek-V2-Chat?

from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

max_model_len, tp_size = 8192, 8  # 8192-token window, tensor parallelism across 8 GPUs
# Remainder of the snippet, completed to match the repo README's vLLM example:
model_name = "deepseek-ai/DeepSeek-V2-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_name)
llm = LLM(model=model_name, tensor_parallel_size=tp_size, max_model_len=max_model_len, trust_remote_code=True)
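
For reference, the context limit the checkpoint was configured with can be read from its Hugging Face config. The following is a minimal sketch, not part of the original post; it assumes the published config exposes the standard max_position_embeddings field, which (memory permitting) is the natural upper bound for vLLM's max_model_len:

from transformers import AutoConfig

# Sketch: inspect the context window the checkpoint was configured with.
# The printed value depends on the published config; the usable length is
# further limited by GPU memory and vLLM engine settings, not just this field.
cfg = AutoConfig.from_pretrained("deepseek-ai/DeepSeek-V2-Chat", trust_remote_code=True)
print(cfg.max_position_embeddings)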