microsoft / BotBuilder-Samples

Welcome to the Bot Framework samples repository. Here you will find task-focused samples in C#, JavaScript/TypeScript, and Python to help you get started with the Bot Framework SDK!
https://github.com/Microsoft/botframework
MIT License

Tokenizer in Choice Prompt fails on some characters #2611

Open johnataylor opened 6 years ago

johnataylor commented 6 years ago

The Tokenizer attempts to split input that includes emojis. This code works for basic emoji characters (thumbs up, thumbs down, smile, etc.) but fails on multi-character emoji sequences that include modifiers, such as the Fitzpatrick scale skin tone modifiers.
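To make the failure mode concrete, here is a minimal sketch (not the SDK's Tokenizer code; `splitCodePoints` is a hypothetical helper) showing that a thumbs up with a skin tone modifier is two code points. Any tokenizer that treats each emoji code point as its own token would presumably split the modifier away from its base character.

```typescript
// Thumbs up (U+1F44D) followed by a medium skin tone modifier (U+1F3FD),
// written as UTF-16 surrogate pairs.
const thumbsUpMediumSkin = "\uD83D\uDC4D\uD83C\uDFFD";

// Hypothetical helper: split a string into code points,
// keeping each surrogate pair together.
function splitCodePoints(s: string): string[] {
  const out: string[] = [];
  for (let i = 0; i < s.length; i++) {
    const unit = s.charCodeAt(i);
    const next = i + 1 < s.length ? s.charCodeAt(i + 1) : 0;
    const isPair =
      unit >= 0xd800 && unit <= 0xdbff && next >= 0xdc00 && next <= 0xdfff;
    if (isPair) {
      out.push(s.slice(i, i + 2));
      i++; // skip the low surrogate we just consumed
    } else {
      out.push(s.charAt(i));
    }
  }
  return out;
}

// One visible emoji, but 4 UTF-16 code units and 2 code points.
console.log(thumbsUpMediumSkin.length);
console.log(splitCodePoints(thumbsUpMediumSkin).length);
```

The second element returned by `splitCodePoints` is the bare modifier, which is not meaningful as a token on its own.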

I suggest we simplify the Tokenizer in the product code and move the partial implementation of emoji tokenization into a sample. (The Tokenizer should already be an extensibility point, though test coverage for that is weak.)

The justification is that the feature is useful but a long way from being a complete implementation. Providing the code as a sample, together with the extensibility point, makes for a far clearer and more supportable platform. (Extended Unicode would make a great sample and an interesting blog post.)

johnataylor commented 4 years ago

This issue covers https://github.com/microsoft/botbuilder-js/issues/898, so I am closing the more specific issue.

johnataylor commented 4 years ago

The initial target here will be a sample.

The most likely factoring is a standalone function that returns emoji characters and their positions.
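As a rough sketch of what such a function might look like (the names `EmojiMatch` and `findEmoji` are hypothetical and not part of the SDK, and the code point ranges are a coarse approximation rather than the full Unicode emoji data), the following groups a base emoji with any trailing skin tone modifier, variation selector, or zero-width-joiner sequence and reports its position in UTF-16 code units:

```typescript
// Hypothetical result shape; not part of the Bot Framework SDK.
interface EmojiMatch {
  text: string;  // the full emoji sequence, modifiers included
  start: number; // index of the first UTF-16 code unit
  end: number;   // index one past the last UTF-16 code unit
}

const ZERO_WIDTH_JOINER = 0x200d;
const VARIATION_SELECTOR_16 = 0xfe0f;

// Decode the code point starting at UTF-16 index i, joining surrogate pairs.
function decodeCodePoint(s: string, i: number): number {
  const hi = s.charCodeAt(i);
  if (hi >= 0xd800 && hi <= 0xdbff && i + 1 < s.length) {
    const lo = s.charCodeAt(i + 1);
    if (lo >= 0xdc00 && lo <= 0xdfff) {
      return (hi - 0xd800) * 0x400 + (lo - 0xdc00) + 0x10000;
    }
  }
  return hi;
}

// Number of UTF-16 code units a code point occupies.
function unitLength(cp: number): number {
  return cp > 0xffff ? 2 : 1;
}

// Fitzpatrick scale skin tone modifiers, U+1F3FB to U+1F3FF.
function isSkinTone(cp: number): boolean {
  return cp >= 0x1f3fb && cp <= 0x1f3ff;
}

// Assumption: a coarse approximation of "emoji base" ranges for illustration.
function isEmojiBase(cp: number): boolean {
  return (
    (cp >= 0x1f300 && cp <= 0x1faff && !isSkinTone(cp)) ||
    (cp >= 0x2600 && cp <= 0x27bf)
  );
}

function findEmoji(text: string): EmojiMatch[] {
  const matches: EmojiMatch[] = [];
  let i = 0;
  while (i < text.length) {
    const cp = decodeCodePoint(text, i);
    if (!isEmojiBase(cp)) {
      i += unitLength(cp);
      continue;
    }
    const start = i;
    i += unitLength(cp);
    // Absorb modifiers, variation selectors, and ZWJ-joined emoji.
    while (i < text.length) {
      const next = decodeCodePoint(text, i);
      if (isSkinTone(next) || next === VARIATION_SELECTOR_16) {
        i += unitLength(next);
      } else if (
        next === ZERO_WIDTH_JOINER &&
        i + 1 < text.length &&
        isEmojiBase(decodeCodePoint(text, i + 1))
      ) {
        i += 1 + unitLength(decodeCodePoint(text, i + 1));
      } else {
        break;
      }
    }
    matches.push({ text: text.slice(start, i), start, end: i });
  }
  return matches;
}

// Thumbs up with a medium skin tone, embedded in plain text.
console.log(findEmoji("ok \uD83D\uDC4D\uD83C\uDFFD yes"));
```

A custom tokenizer could call a function like this first and then tokenize the text between the reported positions, so that modifier sequences are never split. Flags, keycaps, and other sequence types are not handled here.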

And then, perhaps additionally