lmg-anon / mikupad

LLM Frontend in a single html file
https://lmg-anon.github.io/mikupad/mikupad.html
Creative Commons Zero v1.0 Universal
175 stars 24 forks

[Feature Request] Phrase bias #58

Open neCo2 opened 2 months ago

neCo2 commented 2 months ago

Now that token bias is implemented, the next step would be phrase bias. The tokenize/detokenize functions used by the token bias implementation already return the tokenization of the entire input string, but currently only the first token of a multi-token string is biased.
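
To make the limitation concrete, here is a minimal sketch of first-token-only biasing. It is not mikupad's code: `VOCAB`, `toy_tokenize` and `first_token_bias` are hypothetical names, and the toy tokenizer stands in for whatever tokenize endpoint the backend provides.

```python
# Hypothetical toy vocabulary standing in for a real backend tokenizer.
VOCAB = {"shivers": 1, " down": 2, " her": 3, " spine": 4, "ministrations": 5}


def toy_tokenize(text: str) -> list[int]:
    """Greedy longest-match tokenization against VOCAB (illustration only)."""
    ids = []
    i = 0
    while i < len(text):
        match = max((p for p in VOCAB if text.startswith(p, i)), key=len, default=None)
        if match is None:
            raise ValueError(f"cannot tokenize {text[i:]!r}")
        ids.append(VOCAB[match])
        i += len(match)
    return ids


def first_token_bias(entries: dict[str, float]) -> dict[int, float]:
    """Mimic the behavior described above: only the first token of each
    string receives the bias, the remaining tokens are ignored."""
    bias = {}
    for text, value in entries.items():
        tokens = toy_tokenize(text)
        bias[tokens[0]] = value
    return bias
```

With these toy definitions, biasing "shivers down her spine" ends up penalizing every occurrence of "shivers", whether or not the rest of the phrase would have followed.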

As I see it, the way to implement this would be the following (for negative bias):

  1. Remove all multi-token strings from the normal bias function.
  2. Monitor the streamed tokens for the multi-token strings, stop generation when one of them is found, and remove the matched string from the context.
  3. Bias the first token of the string (possibly the remaining tokens of the string as well).
  4. Generate as many tokens as the biased string contains, then stop again.
  5. Remove the biases added for the string.
  6. Resume normal generation.
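
The steps above could be sketched roughly like this. Everything here is an assumption for illustration: `toy_logits` is a made-up deterministic "model", tokens are plain strings, sampling is greedy, and the `max_steps` cap is only there so an insufficient bias cannot loop forever. A real implementation would stream from the backend instead.

```python
def toy_logits(context: list[str]) -> dict[str, float]:
    """Hypothetical model: next-token logits depend only on the last token."""
    table = {
        "She": {" felt": 1.0},
        " felt": {" shivers": 2.0, " calm": 1.0},
        " shivers": {" down": 2.0, ".": 1.0},
        " down": {" her": 1.0},
        " her": {" spine": 2.0, " arm": 1.0},
        " spine": {".": 1.0},
        " calm": {".": 1.0},
        " arm": {".": 1.0},
        ".": {"<eos>": 1.0},
    }
    return table[context[-1]]


def generate_with_phrase_ban(prompt, phrases, bias=-5.0, max_steps=50):
    """Greedy generation with a negative bias on multi-token phrases."""
    generated = []
    temp_bias = {}   # biases active only while regenerating after a rollback
    remaining = 0    # tokens left to generate under the temporary bias
    for _ in range(max_steps):
        logits = toy_logits(prompt + generated)
        token = max(logits, key=lambda t: logits[t] + temp_bias.get(t, 0.0))
        if token == "<eos>":
            break
        generated.append(token)
        if remaining > 0:
            remaining -= 1
            if remaining == 0:
                temp_bias = {}  # step 5: drop the temporary biases
        for phrase in phrases:
            # Step 2: watch the tail of the output for a banned phrase.
            if generated[-len(phrase):] == phrase:
                del generated[-len(phrase):]    # remove it from the context
                temp_bias = {phrase[0]: bias}   # step 3: bias the first token
                remaining = len(phrase)         # step 4: regenerate this many tokens
                break
    return prompt + generated
```

With the toy model, `generate_with_phrase_ban(["She"], [[" shivers", " down"]])` rolls back once and continues with " calm" instead. Only the generated part is ever rolled back, so a phrase that straddles the prompt boundary is left alone in this sketch.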

As for positive bias, I'm not quite sure how it should be implemented. Possibly biasing the first token, stopping when it's generated, then biasing the rest of the tokens?
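
One possible reading of that idea, sketched under heavy assumptions: recompute the bias before every sampling step, boosting the phrase's first token by default and its next token whenever the output currently ends with part of the phrase. `positive_bias_for_step` is a hypothetical helper, and tokens are plain strings for readability.

```python
def positive_bias_for_step(generated: list[str], phrase: list[str], bias: float) -> dict[str, float]:
    """Return the bias map to use for the next sampling step.

    If the output ends with the first k tokens of the phrase (0 < k < len(phrase)),
    boost token k so the phrase tends to continue; otherwise boost the first token.
    """
    # Check the longest partial match first.
    for k in range(len(phrase) - 1, 0, -1):
        if generated[-k:] == phrase[:k]:
            return {phrase[k]: bias}
    return {phrase[0]: bias}
```

For example, with the phrase `[" cherry", " blossom", " petals"]`, an output ending in " cherry" gets " blossom" boosted on the next step. Whether the rest of the phrase should get the same bias value as the first token is left open here.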

I'm working on it, but the fact that most of my knowledge of React comes from reading the source of this project is coming back to bite me in the ass, as I can't get my state variables to update correctly. I'll probably have to read up on a lot of shit before I can get it anywhere near functional. The code I've written so far's fair game if anyone wants to have a go at it. (Though you'd probably be better off just starting over.)

neCo2 commented 2 months ago

Dropping these here for future reference since, as far as I understand it, they might allow for a much smoother and more performant way of implementing phrase bias: oobabooga/text-generation-webui#5677 and ggerganov/llama.cpp#6839