Closed hvisser closed 7 months ago
Thanks for the pull request! I changed it slightly to keep the default behavior, but allow setting tokenize_special
via the InferenceParameters,
like:
InferenceParameters inferParams = new InferenceParameters().setTokenizeSpecial(true);
// ...
model.generate(prompt, inferParams);
Thanks, though I wonder why one would ever not want to tokenize these special tokens if the model has them. The whole point of adding these "special" tokens is to treat a sequence as a single token, as I understand it. Moving this to the inference parameters puts the burden on the library's user, and since I spent a few hours tracking down this issue, I suspect it won't be obvious to anyone else encountering the same problem.
I agree, by default most users will want to tokenize these tokens. My reasoning was to stick to the default behavior of llama.cpp, where the parameter is false by default. I guess it's useful if you want to talk about those tokens as if they were plain text (or get answers containing them) without triggering their special functionality. I'll change the default to true in the Java binding, though.
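To make the behavior difference concrete, here is a toy sketch (not the actual llama.cpp tokenizer, and the special token string is just an illustrative example) of what the flag controls: with special-token handling enabled, a registered marker like "<|im_start|>" maps to a single token; with it disabled, the same text falls through to ordinary tokenization (character-level here for simplicity).

```java
import java.util.ArrayList;
import java.util.List;

public class SpecialTokenDemo {
    // Hypothetical special token for illustration only.
    static final String SPECIAL = "<|im_start|>";

    static List<String> tokenize(String text, boolean tokenizeSpecial) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < text.length()) {
            if (tokenizeSpecial && text.startsWith(SPECIAL, i)) {
                // The whole marker becomes one token.
                tokens.add(SPECIAL);
                i += SPECIAL.length();
            } else {
                // Fallback: treat the marker as plain text (char-level here).
                tokens.add(String.valueOf(text.charAt(i)));
                i++;
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        String prompt = "<|im_start|>hi";
        // Enabled: "<|im_start|>" + "h" + "i" = 3 tokens.
        System.out.println(tokenize(prompt, true).size());
        // Disabled: 12 characters of the marker + "h" + "i" = 14 tokens.
        System.out.println(tokenize(prompt, false).size());
    }
}
```

With the flag off, the model sees the marker as literal text it can talk about; with it on, the marker triggers its special role (e.g. delimiting a chat turn), which is what most users expect.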
Changing the default is a good solution! Thanks again 😁 The parameter may be false by default in llama.cpp, but its main example always sets it to true, so using the same default makes it easy to compare the two.
Fixes #45