Windows hfllama7b `test_repeat_calls` failing in PR gate

The bug

`basic-tests-win (3.9, hfllama7b)` (as well as the 3.10, 3.11, and 3.12 variants) is failing in the PR gate due to the `tests/model_specific/test_llama_cpp.py::test_repeat_calls` test:

    def test_repeat_calls(llamacpp_model: guidance.models.Model):
        llama2 = llamacpp_model
        a = []
        lm = llama2 + "How much is 2 + 2? " + gen(name="test", max_tokens=10)
        a.append(lm["test"])
        lm = llama2 + "How much is 2 + 2? " + gen(name="test", max_tokens=10, regex=r"\d+")
        a.append(lm["test"])
        lm = llama2 + "How much is 2 + 2? " + gen(name="test", max_tokens=10)
        a.append(lm["test"])
>       assert a[-1] == a[0]
E       assert "(Just so we're clear, we'" == 'What is 2 + 2?'
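
The invariant this test checks is that an unconstrained call gives the same text before and after a regex-constrained call on the same base model. A minimal, model-free sketch of that check follows. `run_repeat_sequence` and `toy_generate` are hypothetical names made up for illustration and are not part of guidance; the stub simply returns fixed strings so the pattern can be run without a llama.cpp model.

```python
from typing import Callable, List, Optional


def run_repeat_sequence(generate: Callable[[Optional[str]], str]) -> List[str]:
    """Call `generate` unconstrained, then regex-constrained, then unconstrained.

    `generate` is a hypothetical callable taking an optional regex and
    returning the generated text; it stands in for the three
    `llama2 + prompt + gen(...)` lines in the test.
    """
    return [generate(None), generate(r"\d+"), generate(None)]


def toy_generate(regex: Optional[str]) -> str:
    # Hypothetical deterministic stand-in for a real model call.
    return "4" if regex is not None else "What is 2 + 2?"


outputs = run_repeat_sequence(toy_generate)
# Same invariant as `assert a[-1] == a[0]` in test_repeat_calls.
assert outputs[-1] == outputs[0], f"first={outputs[0]!r} last={outputs[-1]!r}"
```

Printing both values in the assertion message, as above, makes it easier to see whether the first or the last call is the one that changed between runs.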

System Info

guidance version: technically 0.1.15, but current HEAD of main

OS: Windows (Large_Windows_0399a097b16f workflow runner)

guidance-ai/guidance issue #913