Closed Webifi closed 1 year ago
Hi @Webifi,
Thanks for letting us know. We may have a bug in token-by-token decoding on https://chat.petals.dev
I'll try to reimplement it exactly as in this demo space once I'm back from vacation (or feel free to open a pull request yourself).
@borzunov This seems to be a limitation of SentencePiece. I put in a workaround in PR #31.
@Webifi Thanks for taking the time to fix this! This seems to be resolved in #31, but feel free to continue this discussion in case of further issues.
Unicode characters in responses from all models on chat.petals.dev seem to be garbled.
Ask it to "show an emoji", translate something to Japanese, etc., and it will usually return a fair number of \ufffd characters (U+FFFD, the Unicode replacement character) instead of the correct Unicode.
The Hugging Face Llama 2 chat demo does respond correctly: https://huggingface.co/blog/llama2#demo
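For anyone trying to reproduce the symptom outside the chat UI, here is a minimal sketch of one plausible cause, using only the Python standard library. It assumes the streamed pieces are raw UTF-8 bytes that can split a multi-byte character (as can happen with byte-fallback tokens); the one-byte-per-chunk split and the helper names are hypothetical and are not taken from the Petals code or from PR #31.

```python
import codecs


def decode_chunks_naively(chunks):
    # Decode every chunk on its own. A chunk that holds only part of a
    # multi-byte UTF-8 sequence cannot be decoded and becomes U+FFFD.
    return "".join(chunk.decode("utf-8", errors="replace") for chunk in chunks)


def decode_chunks_incrementally(chunks):
    # An incremental decoder buffers incomplete sequences until the
    # remaining bytes arrive, so split characters are reassembled.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    pieces = [decoder.decode(chunk) for chunk in chunks]
    pieces.append(decoder.decode(b"", final=True))  # flush any leftover bytes
    return "".join(pieces)


text = "Hi 😀"
data = text.encode("utf-8")  # the emoji takes 4 bytes in UTF-8

# Hypothetical worst case: every byte arrives as a separate streamed piece.
chunks = [data[i:i + 1] for i in range(len(data))]

naive = decode_chunks_naively(chunks)
incremental = decode_chunks_incrementally(chunks)

print(repr(naive))
print(repr(incremental))
```

The naive path reproduces the \ufffd output described above, while the incremental path returns the original string. Buffering until a complete character is available is one common way to handle this; whether it matches the workaround in PR #31 is not something this sketch claims.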