replicate / cog-llama-template

LLaMA Cog template
Apache License 2.0

properly decode emoji #75

Closed daanelson closed 1 year ago

daanelson commented 1 year ago

Unlike words, emoji aren't simply the concatenation of their decoded tokens. Each individual token that makes up an emoji decodes to � on its own, so if we stream tokens one at a time, users will see ���� instead of, say, 🔥.

This fixes that behavior for MLC and vLLM. Note that MLC was actually broken (returning �), whereas vLLM was returning proper emoji but would first yield three empty strings.
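
For illustration, here is a minimal sketch of the general idea, not the code in this PR. It assumes the emoji is split into one token per UTF-8 byte, and the `stream_text` helper is a hypothetical name. It uses Python's standard incremental UTF-8 decoder to hold back partial byte sequences and skips the empty strings they produce along the way.

```python
import codecs
from typing import Iterable, Iterator


def stream_text(byte_chunks: Iterable[bytes]) -> Iterator[str]:
    """Hypothetical helper: yield text only once whole characters are available."""
    # The incremental decoder buffers incomplete multi-byte sequences
    # instead of turning each fragment into U+FFFD.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="replace")
    for chunk in byte_chunks:
        text = decoder.decode(chunk)
        if text:  # a partial sequence decodes to "", so don't yield it
            yield text
    # Flush anything left over (a truncated sequence becomes U+FFFD here).
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


fire = "🔥".encode("utf-8")  # four bytes
# Assumption: the model emits one token per byte of the emoji.
pieces = [b"hot ", fire[0:1], fire[1:2], fire[2:3], fire[3:4], b"!"]

# Decoding each piece independently reproduces the bug described above.
naive = [p.decode("utf-8", errors="replace") for p in pieces]

# Buffered decoding yields the emoji once, with no empty strings before it.
fixed = list(stream_text(pieces))

print(naive)
print(fixed)
```

Real tokenizers don't necessarily expose raw bytes per token, so an actual fix in MLC or vLLM would have to work with whatever the engine's detokenizer returns. The sketch only shows why buffering until a character is complete avoids both the � output and the empty intermediate yields.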