utilityai / llama-cpp-rs


Proposal: Expose `llama_get_logits`-based method on Context in high level API #505

Open brittlewis12 opened 1 day ago

brittlewis12 commented 1 day ago

Today, llama-cpp-2 exposes llama_get_logits_ith-based candidates_ith.

In working with the underlying library, I have leaned on the slightly different `llama_get_logits`. While the underlying library accepts `-1` as input to the `-ith` variant to accomplish the same thing as `llama_get_logits`, the current implementation's safety check, which ensures logits are initialized for the given position, disallows `-1` as a side effect.

Honestly, the existing implementation is sound as is, and either way I'd prefer to use the other, slightly simpler seam.

I took a quick stab at what this could look like, following the existing pattern: a lower-level unsafe call plus a higher-level wrapper that returns an unsorted `LlamaTokenData` iterator for you.
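Roughly, the high-level half of that pattern has this shape. This is just an illustrative sketch of the logits-to-candidates mapping, not the crate's actual API; `TokenData` and `candidates_from_logits` are stand-in names:

```rust
// Stand-in for the crate's LlamaTokenData (illustrative only).
#[derive(Debug, PartialEq)]
struct TokenData {
    id: i32,
    logit: f32,
}

/// Map a raw logits slice for one token (what `llama_get_logits` exposes
/// for the last row) into an unsorted candidate iterator, one entry per
/// vocab id -- the same shape the proposed `candidates_last()` would yield.
fn candidates_from_logits(logits: &[f32]) -> impl Iterator<Item = TokenData> + '_ {
    logits
        .iter()
        .enumerate()
        .map(|(i, &logit)| TokenData { id: i as i32, logit })
}

fn main() {
    // Pretend n_vocab == 3 for the example.
    let logits = [0.1_f32, 2.5, -1.0];
    let candidates: Vec<_> = candidates_from_logits(&logits).collect();
    assert_eq!(candidates.len(), 3);
    assert_eq!(candidates[1], TokenData { id: 1, logit: 2.5 });
    println!("{candidates:?}");
}
```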

I tweaked the existing simple example to use this method without issue:

```diff
-            let candidates = ctx.candidates_ith(batch.n_tokens() - 1);
+            let candidates = ctx.candidates_last();
```

Thank you as ever for your maintenance efforts here @MarcusDunn!


a little more background via llama.cpp:

> Token logits obtained from the last call to `llama_decode()`.
> The logits for which `llama_batch.logits[i] != 0` are stored contiguously in the order they have appeared in the batch.
> Rows: number of tokens for which `llama_batch.logits[i] != 0`
> Cols: `n_vocab`
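To make the layout concrete: the buffer is `rows * n_vocab` floats, and the last row is what a `candidates_last()`-style method would read. A minimal sketch of that indexing (assuming at least one row was requested):

```rust
/// Return the last row of a contiguous `rows x n_vocab` logits buffer,
/// i.e. the logits for the last token that had `llama_batch.logits[i] != 0`.
/// Assumes `logits.len()` is a non-zero multiple of `n_vocab`.
fn last_row(logits: &[f32], n_vocab: usize) -> &[f32] {
    let rows = logits.len() / n_vocab;
    &logits[(rows - 1) * n_vocab..]
}

fn main() {
    let n_vocab = 4;
    // Two rows of logits, flattened into one contiguous buffer.
    let buf: Vec<f32> = vec![0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0, 13.0];
    assert_eq!(last_row(&buf, n_vocab), &[10.0, 11.0, 12.0, 13.0]);
    println!("{:?}", last_row(&buf, n_vocab));
}
```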

MarcusDunn commented 19 hours ago

whoops, thought https://github.com/brittlewis12/llama-cpp-rs/pull/1 was a PR, didn't mean to review your fork!

brittlewis12 commented 19 hours ago

no problem 😆 will open it up properly on your end

brittlewis12 commented 19 hours ago

re: https://github.com/brittlewis12/llama-cpp-rs/pull/1#pullrequestreview-2323689876 would removing “last” from the names, so get_logits & just candidates, be more in line with your preferred naming conventions?

MarcusDunn commented 18 hours ago

yeah, that matches what I've done elsewhere more closely.