PeterSH6 opened 7 months ago
This is already supported through sampling params and the OpenAI-compatible API as of v0.3.2.
Closed by #2514
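For reference, a per-request seed can be passed in the request body of the OpenAI-compatible API. A hedged sketch of such a request payload (the model name is a placeholder; only the `seed` field is the point here):

```json
{
  "model": "my-model",
  "prompt": "Once upon a time",
  "temperature": 1.0,
  "max_tokens": 32,
  "seed": 42
}
```

Sending the same payload (including the same `seed`) twice should then yield identical samples, subject to the kernel-level caveats discussed below.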
It seems that the latest version supports a per-request seed, but there may still be nondeterminism. When torch.use_deterministic_algorithms(True) is set, PyTorch raises an error because the cumsum() operation in https://github.com/vllm-project/vllm/blob/main/vllm/model_executor/layers/sampler.py#L205 does not have a deterministic CUDA kernel.
Therefore, the current version may not be truly deterministic. Is it possible to bypass this operation? @simon-mo
Good point. It seems this is still unresolved on the PyTorch side: https://github.com/pytorch/pytorch/issues/75240
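For context on why cumsum() appears in the sampler at all: cumulative sums turn a probability vector into a CDF for inverse-CDF (multinomial) sampling. A pure-Python sketch of the idea (not vLLM's actual code; on GPU it is the parallel cumsum kernel performing this step that lacks a deterministic implementation):

```python
import random
from itertools import accumulate

def sample_from_probs(probs, rng):
    """Inverse-CDF sampling: draw u uniformly in [0, total), then pick the
    first index whose cumulative probability exceeds u. This is the role
    cumsum() plays in the sampler."""
    cdf = list(accumulate(probs))   # cumulative distribution, e.g. [0.1, 0.3, 1.0]
    u = rng.random() * cdf[-1]      # scale by the total to guard against rounding
    for i, c in enumerate(cdf):
        if u < c:
            return i
    return len(probs) - 1           # fallback for floating-point edge cases

rng = random.Random(0)
token = sample_from_probs([0.1, 0.2, 0.7], rng)
print(token)
```

With a fixed seed this pure-Python version is fully reproducible; the nondeterminism discussed above comes from the floating-point reduction order inside the CUDA cumsum kernel, not from the sampling scheme itself.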
Hi vllm maintainers,
Thanks for the awesome project!
I'm wondering whether there is a deterministic option/flag to make the model generate identical results across different runs with the same prompts (supporting the random and beam search samplers as well, not only the greedy sampler). Is it enough to set the following random state to get deterministic behavior? I'm not sure what other factors might break determinism.
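The random-state snippet referred to above is not shown in this thread; a hedged sketch of the typical seed setup such questions mean (stdlib `random` shown runnable; the numpy/torch lines are the usual additions for a full setup, noted as comments):

```python
import random

def seed_everything(seed: int) -> None:
    """Seed the common sources of randomness for reproducible runs."""
    random.seed(seed)
    # In a full setup one would typically also seed:
    #   numpy.random.seed(seed)
    #   torch.manual_seed(seed)
    #   torch.cuda.manual_seed_all(seed)  # for GPU runs
    # Even then, nondeterministic CUDA kernels can still vary across runs.

seed_everything(0)
a = [random.random() for _ in range(3)]
seed_everything(0)
b = [random.random() for _ in range(3)]
print(a == b)  # same seed, identical draws
```

As the thread goes on to note, seeding alone is not sufficient: nondeterministic CUDA kernels (such as cumsum) can still produce run-to-run differences.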
CC: @WoosukKwon @zhuohan123 @Yard1