AntonioZC666 closed this issue 3 months ago
That option got removed at some point, but the help text wasn't updated. I don't think it ever had an effect that would have been useful from the commandline, though.
Also, unfortunately the commandline option handling is somewhat janky: it's basically the union of all the options that any of the examples supports. So when you run --help you'll see options that only some of the examples support, and the example you're currently running may not support a given option at all.
Thank you. And can I get the number of tokens for the text generated by each input? I looked up a lot of posts, but none of them mention this.
> And can I get the number of tokens for the text generated by each input?
Do you mean the top-N most likely tokens? I don't think you can get that information from the commandline, but the server example supports returning it in queries if you ask for it (see the README in the examples/server directory). Or of course there's always using llama.cpp as an API, and then you can do whatever you want when sampling.
This issue was closed because it has been inactive for 14 days since being marked as stale.
Hi! I'm a beginner in performance optimization, and I have recently been using llama.cpp as my workload. The model I used is llama-2-7b-chat.Q3_K_M.gguf.
I want to know the number of tokens for the text generated by each input. How can I get them?
In the parameters menu, I see:
--logits-all return logits for all tokens in the batch (default: disabled)
However, when I used it I got this error: error: unknown argument: --logits-all
How can I get the token information? Can the --logits-all parameter help me? If so, how should I use it? Thank you.
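Aside from --logits-all (which, per the reply above, no longer exists), llama.cpp prints a timing summary at the end of a commandline run, and the eval line's "runs" count corresponds to the number of generated tokens. A minimal sketch of pulling that count out of the log; the exact spacing and wording of the line are assumed from one version and vary between releases:

```python
import re

# A timing line as printed by llama.cpp at the end of a run; the exact
# spacing and wording are assumed here and vary between versions.
line = "llama_print_timings:        eval time =     954.23 ms /   127 runs"

# The "runs" count on the eval line equals the number of generated tokens.
m = re.search(r"/\s*(\d+)\s+runs", line)
if m:
    print(int(m.group(1)))
```

This is only a log-scraping convenience; for structured access, the server response fields or the library API mentioned earlier are more robust.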