Closed SkanderHellal closed 8 months ago
Issue: The OpenChat API appears to exhibit non-deterministic behavior in RAG applications using the same user query.
Details When using the same user query, I get different answer results. In fact, I am not getting the same output texts with the same user query.
Versions vllm==0.3.2 ochat=3.5.1
Openchat configuration
model='openchat/openchat-3.5-1210', tokenizer='openchat/openchat-3.5-1210', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=8192, download_dir=None, load_format=auto, tensor_parallel_size=2, disable_custom_all_reduce=True, quantization=None, enforce_eager=False, kv_cache_dtype=auto, device_config=cuda, seed=123
@SkanderHellal Thanks for your PR. Can you check if the issue still happens?
@imoneoi Thank you for your reply, now openchat has a deterministic behavior with seed fix. I close the issue.
Issue: The OpenChat API appears to exhibit non-deterministic behavior in RAG applications using the same user query.
Details When using the same user query, I get different answer results. In fact, I am not getting the same output texts with the same user query.
Versions vllm==0.3.2 ochat=3.5.1
Openchat configuration
model='openchat/openchat-3.5-1210', tokenizer='openchat/openchat-3.5-1210', tokenizer_mode=auto, revision=None, tokenizer_revision=None, trust_remote_code=False, dtype=torch.float16, max_seq_len=8192, download_dir=None, load_format=auto, tensor_parallel_size=2, disable_custom_all_reduce=True, quantization=None, enforce_eager=False, kv_cache_dtype=auto, device_config=cuda, seed=123