Describe the bug
The first token appears to be sampled before the context (prompt) has finished processing, or something along those lines. I'm not sure whether this affects the regular llama.cpp backend or only llamacpp_HF. I never see this behavior in koboldcpp.
Is there an existing issue for this?
[X] I have searched the existing issues
Reproduction
The problem is inconsistent and I haven't found a reliable trigger yet; reloading the model sometimes fixes it. It may be a race condition where sampling starts before the model has finished evaluating the prompt. A small illustrative sketch of the kind of race I mean is below.
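For illustration only: a minimal, self-contained Python sketch of the suspected class of bug, where one code path samples the first token as soon as generation is requested instead of waiting for prompt evaluation to finish. All names here are hypothetical stand-ins, not code from the webui or llama.cpp.

```python
import threading
import time

# Event set once prompt (context) evaluation has produced valid logits.
logits_ready = threading.Event()
state = {"logits": None}  # None until prompt evaluation fills it in

def evaluate_prompt():
    # Stand-in for prompt/context processing; takes noticeable time.
    time.sleep(0.5)
    state["logits"] = [0.1, 0.7, 0.2]
    logits_ready.set()

def sample_first_token(wait_for_prompt: bool):
    if wait_for_prompt:
        logits_ready.wait()       # correct path: block until eval finishes
    logits = state["logits"]      # buggy path may read uninitialized state
    if logits is None:
        return "<garbage>"        # first token sampled before model is ready
    return max(range(len(logits)), key=logits.__getitem__)

threading.Thread(target=evaluate_prompt).start()
print("buggy:  ", sample_first_token(wait_for_prompt=False))
print("correct:", sample_first_token(wait_for_prompt=True))
```

If the real backend has a similar ordering gap, it would also explain why the symptom is intermittent: whether the first sample lands before or after prompt evaluation completes depends on timing.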
Screenshot
Logs
System Info