When I use this grammar in inference params, on the second inference it fails with an assert "GGML_ASSERT: D:\a\LLamaSharp\LLamaSharp\llama.cpp:6649". You can find the assert here.
Is there a way I can get the grammar to run twice for inferences in quick succession?
@martindevans suggested parsing the grammar again instead of re-using the parsed grammar and it worked! It's a temporary fix, but it operates as expected.
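A minimal sketch of that workaround, assuming a LLamaSharp version where `InferenceParams.Grammar` takes the handle returned by `Grammar.CreateInstance()`. The helper name `BuildParams`, the `grammarPath` argument, the `"root"` start rule, and the `MaxTokens` value are illustrative assumptions, not taken from the original code.

```csharp
using System.IO;
using LLama.Common;
using LLama.Grammars;

public static class GrammarWorkaround
{
    // Hypothetical helper: call this once per inference so that every run
    // gets a freshly parsed grammar and a fresh native grammar handle.
    public static InferenceParams BuildParams(string grammarPath)
    {
        // Read the GBNF text from disk (the path is supplied by the caller).
        var gbnf = File.ReadAllText(grammarPath).Trim();

        // Parse again instead of re-using a previously parsed grammar.
        // "root" is assumed to be the grammar's start rule.
        var grammar = Grammar.Parse(gbnf, "root");

        return new InferenceParams
        {
            // A new instance per inference; dispose the handle after the run.
            Grammar = grammar.CreateInstance(),
            MaxTokens = 256 // arbitrary illustrative limit
        };
    }
}
```

Each inference would then call `GrammarWorkaround.BuildParams(...)` rather than sharing one `InferenceParams` object between runs, and dispose `Grammar` on the returned object once the run has finished.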
I am using the grammar below with OpenHermes-2.5-Mistral-7B-16k-GGUF.
I read and parse the grammar in a few lines of code using LLamaSharp.