When loading a model with release 0.3.1 and setting the seed to either -1 (random) or leaving it at the default (which, according to the docs, is supposed to use a RNG), the first reply for a given prompt is always the same. Consecutive replies for the same prompt differ, but the whole chain repeats identically each time the model is reloaded and the steps are repeated.
This points towards the seed not being randomized on load.
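The symptom is consistent with the sampler being seeded with the same fixed value on every load instead of a fresh random one. A minimal sketch of the difference, using plain Python `random` as a stand-in for llama.cpp's sampler (`first_token` is a hypothetical proxy for the first sampling step after load, not the library's API):

```python
import random
import secrets

def first_token(seed):
    """Hypothetical stand-in for the first sampling draw after a model 'load'."""
    rng = random.Random(seed)
    return rng.random()

# Seed fixed (not randomized on load): the first draw is identical across
# "loads" -- matching the symptom observed in 0.3.1.
assert first_token(1234) == first_token(1234)

# Seed actually drawn from entropy on each load: the first draw differs
# between loads -- the expected (0.2.90) behavior.
a = first_token(secrets.randbits(32))
b = first_token(secrets.randbits(32))
# a == b only with negligible probability
```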
What I expect (and what worked earlier): loading a model with a random seed generates a different first reply for the same prompt.
The issue is not present in version llama-cpp-python==0.2.90
INB4: this is not about setting Top-P to 1.0, which causes identical output for every identical prompt, documented here: https://github.com/abetlen/llama-cpp-python/issues/1797