vllm-project / vllm

A high-throughput and memory-efficient inference and serving engine for LLMs
https://docs.vllm.ai
Apache License 2.0

[Bug]: how to set gpu id in code? #8264

Open cqray1990 opened 1 week ago

cqray1990 commented 1 week ago

Your current environment

The official example code is:

```python
from vllm import LLM, SamplingParams

prompts = [
    "Hello, my name is",
    "The president of the United States is",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95)

llm = LLM(model="facebook/opt-125m")

outputs = llm.generate(prompts, sampling_params)

# Print the outputs.
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")
```

Then I want to choose which GPU(s) are used, e.g. a specific ID, one GPU, or several GPUs. The default is `device="auto"`, but how do I set something like `device="1,2,3"`?

🐛 Describe the bug


Using the official example code above, I want to choose which GPU(s) are used, e.g. a specific ID, one GPU, or several GPUs. The default is `device="auto"`, but how do I set something like `device="1,2,3"`? Setting `os.environ['CUDA_VISIBLE_DEVICES'] = "0,1,2"` has no effect. Is there any other way?


DarkLight1337 commented 1 week ago

Could you explain the rationale behind this? If you want to have a separate instance of vLLM on each GPU, it would be best to set CUDA_VISIBLE_DEVICES on the command line (rather than in Python) and run each instance in a separate process.
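
For example, the suggestion above might look like this (a sketch; the model name, ports, and server entrypoint invocation are illustrative):

```shell
# One independent vLLM server per GPU, each pinned to its device
# via CUDA_VISIBLE_DEVICES and running in its own process.
CUDA_VISIBLE_DEVICES=0 python -m vllm.entrypoints.openai.api_server \
    --model facebook/opt-125m --port 8000 &
CUDA_VISIBLE_DEVICES=1 python -m vllm.entrypoints.openai.api_server \
    --model facebook/opt-125m --port 8001 &
```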

mayankjobanputra commented 1 week ago

> Could you explain the rationale behind this? If you want to have a separate instance of vLLM on each GPU, it would be best to set CUDA_VISIBLE_DEVICES inside the command-line (instead of Python) and run each instance in a separate process.

I tested this, it works perfectly :)

cqray1990 commented 1 week ago

> Could you explain the rationale behind this? If you want to have a separate instance of vLLM on each GPU, it would be best to set CUDA_VISIBLE_DEVICES inside the command-line (instead of Python) and run each instance in a separate process.

@mayankjobanputra Can it only be set from the command line? I want to pass the devices as a function parameter.