ngxson / wllama

WebAssembly binding for llama.cpp - Enabling in-browser LLM inference
https://huggingface.co/spaces/ngxson/wllama
MIT License

Seeing <|end|> in output #45

Closed: flatsiedatsie closed this issue 6 months ago

flatsiedatsie commented 6 months ago

If I see a stop token in the text output of the model, does that mean that it 'slipped past' llama.cpp somehow? Does that mean I have a setting wrong somewhere? Or is this normal/common, and should I just keep an eye out for such tokens regardless, and try to interrupt the generation process when it happens?

Example:

"What is the difference between red and green apples?" on Phi3 mini 128K:

The primary difference between red and green apples lies in their color, which can be an indicator of their taste and nutritional content. However, there are some other differences as well:

Color: Red apples are typically bright red, while green apples are, as the name suggests, green in color.

Taste: Generally, red apples are sweeter and more flavorful than green apples. However, this can vary depending on the specific variety of apple.

Nutritional content: Red apples often have a higher content of antioxidants, such as vitamin C and flavonoids, compared to green apples.
However, it's important to note that there's significant variation within each type, and there are many different varieties of red and green apples, each with their own unique taste and nutritional profile.<|assistant|> The primary distinction between red and green apples is their coloration. Red apples are typically bright red, while green apples are as their namesake suggest, green.

Taste-wise, red apples generally have a sweeter, more flavorful taste compared to green apples. This can vary, however, based on the specific apple variety.

Nutritionally, red apples often have higher antioxidant content, such as vitamin C and flavonoids, than green apples.

However, keep in mind that there's considerable variability within each type of apple. There are numerous different varieties of both red and green apples, each possessing their own unique taste and nutritional profile.<|assistant|> The key difference between red and green apples lies in their color. Red apples are typically a bright red, while green apples are green in color.

Taste-wise, red apples are generally sweeter and more flavorful than green apples. However, this can vary depending on the specific variety of apple.

In terms of nutrition, red apples often contain higher levels of antioxidants, such as vitamin C and flavonoids, compared to green apples.

However, it's important to remember that there's a wide range of varieties within each type of apple. There are numerous different varieties of

.. And there I interrupted it.

felladrin commented 6 months ago

Could you confirm whether you're using <|end|> after each turn in the prompt? If so, another possible cause is having a too-high penalty_repeat configuration, which would prevent the LLM from repeating the <|end|> token.
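
To illustrate why that could happen, here is a minimal sketch of a llama.cpp-style repetition penalty applied to raw logits. The function name, token ids, and logit values are all made up for illustration; this is not wllama's or llama.cpp's actual code, only the general idea that tokens seen recently get their score pushed down.

```javascript
// Sketch of a repetition penalty: tokens that appeared recently are made
// less likely. All ids and logit values here are hypothetical.
function applyRepeatPenalty(logits, recentTokenIds, penaltyRepeat) {
  const out = logits.slice();
  for (const id of new Set(recentTokenIds)) {
    // Positive logits are divided and non-positive ones multiplied, so the
    // token always becomes less likely when penaltyRepeat > 1.
    out[id] = out[id] > 0 ? out[id] / penaltyRepeat : out[id] * penaltyRepeat;
  }
  return out;
}

const argmax = (xs) => xs.indexOf(Math.max(...xs));

const END_TOKEN_ID = 2;          // hypothetical id standing in for <|end|>
const logits = [1.0, 3.5, 4.0];  // <|end|> is the most likely next token
const recent = [0, END_TOKEN_ID]; // <|end|> already appeared in the prompt

const mild = applyRepeatPenalty(logits, recent, 1.1);
const harsh = applyRepeatPenalty(logits, recent, 2.0);

console.log(argmax(mild) === END_TOKEN_ID);  // mild penalty: <|end|> still wins
console.log(argmax(harsh) === END_TOKEN_ID); // harsh penalty: another token wins
```

With a multi-turn prompt, <|end|> is already in the recent context, so a large enough penalty can make the model prefer some other token and keep going.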

An alternative is to use the new abortSignal to force a stop when specific text shows up in the output. For example:

onNewToken: (_token, _piece, currentText, { abortSignal }) => {
  // Stop as soon as any chat-template marker shows up in the decoded text.
  if (currentText.includes("<|end|>") || currentText.includes("<|user|>") || currentText.includes("<|assistant|>")) {
    // Strip the marker before displaying the final text, then abort generation.
    updateResponse(currentText.replace("<|end|>", "").replace("<|user|>", "").replace("<|assistant|>", ""));
    abortSignal();
  } else {
    updateResponse(currentText);
  }
}
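
A slightly more defensive variant, as a sketch: instead of removing the markers, cut the text at the earliest marker, so nothing generated after it leaks into the UI. `cutAtStopMarker` and `STOP_MARKERS` are hypothetical names, not part of wllama; `updateResponse` and `abortSignal` are the same as in the snippet above.

```javascript
// Hypothetical helper (not part of wllama): truncate text at the first
// chat-template marker and report whether one was found.
const STOP_MARKERS = ["<|end|>", "<|user|>", "<|assistant|>"];

function cutAtStopMarker(text, markers = STOP_MARKERS) {
  let cut = -1;
  for (const marker of markers) {
    const idx = text.indexOf(marker);
    // Remember the earliest position at which any marker occurs.
    if (idx !== -1 && (cut === -1 || idx < cut)) cut = idx;
  }
  return cut === -1
    ? { text, stopped: false }
    : { text: text.slice(0, cut), stopped: true };
}

// Possible use inside the callback shown above:
//   onNewToken: (_token, _piece, currentText, { abortSignal }) => {
//     const result = cutAtStopMarker(currentText);
//     updateResponse(result.text);
//     if (result.stopped) abortSignal();
//   }
```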

flatsiedatsie commented 6 months ago

Yes, using the abort signal is exactly what I have done. Something I noticed there (though I'm not sure yet): the abort signal doesn't seem to toggle the signalling variable back to off?

I'm using Transformers.js to generate the prompt strings. This is what it generates:

<s><|user|>
What's the difference between red and green apples?<|end|>
<|assistant|>

And this is what it looks like if I ask again (after interrupting the first generation too):

<s><|user|>
What's the difference between red and green apples?<|end|>
<|assistant|>
The primary difference between red and green apples lies in their color, taste, and variety. 

1. Color: Red apples are typically red in color, while green apples have a greenish hue.

2. Taste: The taste of<|end|>
<|user|>
What's the difference between red and green apples?<|end|>
<|assistant|>
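
For reference, the layout of those strings can be reproduced by hand. This is only a sketch that mirrors the two prompts pasted above; `formatPhi3Prompt` is a hypothetical name, it is not how Transformers.js builds the string internally, and the trailing newline after the final <|assistant|> is an assumption.

```javascript
// Hand-rolled approximation of the Phi-3 prompt layout shown above.
// Each turn becomes "<|role|>\n" + content + "<|end|>\n", and the prompt
// ends with an open assistant turn for the model to complete.
function formatPhi3Prompt(messages) {
  let prompt = "<s>";
  for (const { role, content } of messages) {
    prompt += `<|${role}|>\n${content}<|end|>\n`;
  }
  return prompt + "<|assistant|>\n";
}

const prompt = formatPhi3Prompt([
  { role: "user", content: "What's the difference between red and green apples?" },
]);
console.log(prompt);
```

Every completed turn ends in <|end|>, which is the token the model is expected to emit itself when it finishes answering.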

> another possible cause is having a too-high penalty_repeat configuration

I haven't set that variable, so I assume it's using a default value for that.

ngxson commented 6 months ago

I think the problem may be that Phi-3 (and some other new models) does not use the EOS token to stop generation; it uses an EOG (end-of-generation) token instead. llama.cpp added support for that a while ago, so a binding for it should also be added to wllama in the next version.
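
Roughly, the difference looks like this. It is a sketch only: the token ids are invented, and `takeUntilStop` is a hypothetical stand-in for the generation loop, not wllama or llama.cpp code.

```javascript
// Hypothetical token ids; real ids come from the model's vocabulary.
const EOS_TOKEN_ID = 100;                   // stands in for <|endoftext|>
const EOG_TOKEN_IDS = new Set([100, 107]);  // <|endoftext|> plus <|end|>

// Stand-in for a generation loop: keep tokens until the stop check fires.
function takeUntilStop(sampledTokens, isStopToken) {
  const kept = [];
  for (const id of sampledTokens) {
    if (isStopToken(id)) break;
    kept.push(id);
  }
  return kept;
}

// 107 (<|end|>) marks the end of the assistant's turn in this fake stream.
const stream = [5, 6, 107, 8, 9];

// Checking only EOS misses <|end|>, so generation runs on and the marker
// ends up in the visible output.
const eosOnly = takeUntilStop(stream, (id) => id === EOS_TOKEN_ID);

// Checking the whole end-of-generation set stops at <|end|>.
const eogAware = takeUntilStop(stream, (id) => EOG_TOKEN_IDS.has(id));

console.log(eosOnly.length, eogAware.length);
```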

flatsiedatsie commented 6 months ago

awesome.