Open brankoradovanovic-mcom opened 1 month ago
Can confirm. Also, top_p = 0.9999 removes the determinism. Setting top_p = 1 is only supposed to disable top_p processing (effectively passing all presented logits to the next sampler in the chain); it should not affect the other samplers in the chain. I am using version 3.4.0 on macOS Sonoma 14.6.1.
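To illustrate what I mean by "disable", here is a minimal sketch of generic top-p (nucleus) filtering in Python. It is not the application's actual sampler code; the function name `top_p_filter` and the example logits are made up for illustration. The point is that with top_p = 1 every token should survive the filter, so whatever sampler comes next still has the full distribution to draw from.

```python
import math

def top_p_filter(logits, top_p):
    """Return the indices of the tokens kept by top-p filtering (hypothetical helper)."""
    # top_p >= 1 means "keep everything": the filter is a no-op.
    if top_p >= 1.0:
        return list(range(len(logits)))

    # Softmax, subtracting the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Visit tokens from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    kept = []
    cumulative = 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        # Stop once the kept tokens cover at least top_p of the probability mass.
        if cumulative >= top_p:
            break
    return sorted(kept)

logits = [2.0, 1.0, 0.5, -1.0]      # made-up example values
print(top_p_filter(logits, 0.5))    # only the most likely token survives
print(top_p_filter(logits, 1.0))    # all tokens survive
```

If a filter like this returns the full token set, determinism can only come from a later stage in the chain (or from top_p = 1 being special-cased somewhere), which is why the behaviour described in this issue looks like a bug.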
Bug Report
When Top-P is set to 1, chat responses are fully deterministic, which shouldn't be the case.
Steps to Reproduce
Can you write a story about a bear and a fox?
Expected Behavior
In 3.2.1, redoing the last chat response with the above settings produces significantly different responses each time.
Your Environment