vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Misc]: How can I pass `system_prompt` on initialization!? #7847

Open NITHISH-1609 opened 1 month ago

NITHISH-1609 commented 1 month ago


So, I run the model using the following Docker command:

docker run -d \
  --name vllm-container \
  -p 8000:8000 \
  --shm-size 2G \
  -e HF_TOKEN=hf_\
  vllm/vllm-openai:v0.5.2 \
  --model=name \
  --max-model-len=3072

How can I pass the system_prompt here? It's a very long text (~2000 characters). We could pass it with each API request (see #4497), but that doesn't feel very efficient.
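
For reference, this is roughly what the per-request approach looks like against the OpenAI-compatible endpoint the container exposes. The model name and prompt file are just placeholders; it's only a sketch of the pattern I'd rather not repeat on every call:

import requests
from pathlib import Path

# The same ~2000-character prompt would be re-sent with every request.
system_prompt = Path("system_prompt.txt").read_text().strip()

resp = requests.post(
    "http://localhost:8000/v1/chat/completions",
    json={
        "model": "name",  # whatever was passed via --model
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": "Hello!"},
        ],
    },
)
print(resp.json()["choices"][0]["message"]["content"])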

So, I'm thinking of adding it to the startup command itself, but I can't find anything like that in the docs. I'd prefer to supply it through a file, e.g. --system_prompt=system_prompt.txt. Is something like this possible?


w013nad commented 1 month ago

You could edit the chat template in the model's tokenizer_config.json to include your system prompt. Then, when you start up the model, it will read the prompt from there instead of you sending it with each request.
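
Something like this, as a rough sketch (the paths and the ChatML-style special tokens are placeholders; check your model's existing chat_template and adapt it rather than replacing it wholesale):

import json
from pathlib import Path

# Hypothetical paths -- point these at a local copy of the model
# and at the file holding the long system prompt.
model_dir = Path("/models/my-model")
system_prompt = Path("system_prompt.txt").read_text().strip()

config_path = model_dir / "tokenizer_config.json"
config = json.loads(config_path.read_text())

# Prepend a fixed system turn to a ChatML-style template so every
# conversation starts with the baked-in system prompt.
config["chat_template"] = (
    "<|im_start|>system\n" + system_prompt + "<|im_end|>\n"
    "{% for message in messages %}"
    "<|im_start|>{{ message['role'] }}\n{{ message['content'] }}<|im_end|>\n"
    "{% endfor %}"
    "{% if add_generation_prompt %}<|im_start|>assistant\n{% endif %}"
)

config_path.write_text(json.dumps(config, indent=2))

You'd then mount the edited directory into the container and point --model at it (e.g. -v /models/my-model:/models/my-model and --model=/models/my-model), so vLLM loads the modified template instead of the one from the Hub.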