ShawnHymel opened this issue 5 months ago
Hi @ShawnHymel! The EOS stop tokens like </s>
are included in the raw bot output. So you can do quick checks like this:
```python
from nano_llm import StopTokens

if text.endswith(tuple(StopTokens)):
    print('EOS')

if any(stop_token in text for stop_token in StopTokens):
    print('EOS')
```
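As a sketch of how that check could drive a "done" signal from a per-token callback (the `on_token` name, the `done_callback` parameter, and the literal stop-token tuple below are hypothetical stand-ins, not the real `nano_llm.StopTokens` contents):

```python
# Hypothetical stand-in for nano_llm.StopTokens
StopTokens = ('</s>', '<|endoftext|>', '<|im_end|>')

def on_token(text, done_callback):
    """Called for each streamed chunk of bot output; fires done_callback
    once an EOS stop token shows up in the raw text."""
    if any(stop_token in text for stop_token in StopTokens):
        done_callback()

# Usage: collect a flag when the stream ends
flags = []
on_token('Hello world', lambda: flags.append('done'))   # no EOS yet
on_token('Goodbye.</s>', lambda: flags.append('done'))  # EOS detected
```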
In e.g. web_chat.py, you have the following callback:
From what I can tell, this is called each time the LLM generates a token as part of a response to a prompt. How can you tell when the LLM is done generating tokens for a given prompt? Or should I just set a simple timeout (e.g. "if no tokens are generated in 0.5 s, send a 'done' signal")?