Closed sashokbg closed 4 months ago
My apologies, I read the constraints part of the documentation more carefully and found my answer:
https://lmql.ai/docs/language/constraints.html
A combination of STOPS_AT and len(TOKENS) does the job, as in this example:
"A story about life:[STORY]" \
where STOPS_AT(STORY, ".") and len(TOKENS(STORY)) > 40
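For reference, a complete query combining both constraints might look like the sketch below (the `argmax` decoder clause and the model path are placeholders I am assuming here, not taken from the original post):

```lmql
argmax
    "A story about life:[STORY]"
from
    lmql.model("local:llama.cpp:<path-to-gguf>")
where
    STOPS_AT(STORY, ".") and len(TOKENS(STORY)) > 40
```

The `len(TOKENS(STORY)) > 40` part keeps the model from stopping at the first period too early, while `STOPS_AT` still ends the variable at a sentence boundary once the minimum length is reached.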
Hello, I am not sure if this is an issue related to the lmql decoder or to the underlying model itself.
I am running an instance of Mistral (llama.cpp format) as a server with the following params:
[Loading llama.cpp model from llama.cpp:/home/alexander/Games2/lmql/models/mistral-7b-v0.1.Q5_K_M.gguf with {'n_ctx': 4096, 'n_gpu_layers': 35, 'repeat_penalty': 1.2, 'temp': 0.8, 'device_map': 'auto'} ]
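As context for the `repeat_penalty: 1.2` setting above: repetition penalties of this style (as introduced in the CTRL paper and used by llama.cpp-style samplers) down-weight recently generated tokens before sampling. A minimal sketch of that idea, with a hypothetical helper name and plain Python lists rather than the actual llama.cpp API:

```python
def apply_repeat_penalty(logits, recent_tokens, penalty=1.2):
    """Down-weight tokens that already appeared in the recent context.

    Positive logits are divided by the penalty, negative ones multiplied,
    so in both cases the token becomes less likely. This mirrors the
    CTRL-style penalty used by llama.cpp-like samplers (sketch only).
    """
    penalized = list(logits)
    for tok in set(recent_tokens):
        if penalized[tok] > 0:
            penalized[tok] /= penalty
        else:
            penalized[tok] *= penalty
    return penalized

logits = [2.0, -1.0, 0.5]
# Tokens 0 and 1 were generated recently, token 2 was not.
out = apply_repeat_penalty(logits, recent_tokens=[0, 1])
```

With a penalty of only 1.2 a strongly repetitive model can still loop, which is why hard constraints such as STOPS_AT are often the more reliable fix.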
My client in the playground runs the following lmql code, taken from the examples in the docs:
The model then keeps repeating the same thing over and over.
Is this related to the argmax decoder and the `where not "\n"` constraint?
Thank you for your help. Best regards, Aleks