Encode tokenizer outputs as utf8 before printing them.

Fixes https://github.com/nod-ai/SHARK-TestSuite/issues/105

These tokenizers sometimes output characters that the default encoding on Windows cannot represent, which makes printing them fail, so explicitly encode the outputs as UTF-8 first. Not all tests currently generate such characters, but the extra safety seems helpful.
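A minimal sketch of the idea, not the exact change in this PR: `encode_for_print` and the sample string are hypothetical stand-ins for a test's decoded tokenizer output. Printing the encoded `bytes` object emits only ASCII escape sequences, so it does not depend on the console encoding.

```python
def encode_for_print(text: str) -> bytes:
    """Encode text as UTF-8 so that printing it never hits the console codec."""
    return text.encode("utf-8")


# Hypothetical decoded output containing two U+FFFD replacement characters.
decoded = "token \ufffd\ufffd output"

# print() of a bytes object shows its repr, with non-ASCII bytes as \x escapes.
print(encode_for_print(decoded))
```

The trade-off is that the printed value is a bytes repr with escapes rather than the rendered text, which is usually acceptable for test logs.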

The error in question:

UnicodeEncodeError: 'charmap' codec can't encode characters in position 82-83: character maps to <undefined>

(Those are the \xef\xbf characters, Unicode "replacement characters": https://stackoverflow.com/a/11162470, https://en.wikipedia.org/wiki/Specials_%28Unicode_block%29)
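The failure can be reproduced on any platform by encoding by hand. This sketch assumes the Windows console is using the cp1252 code page (the actual code page on the affected machine is not stated here), and `charmap_error` is a hypothetical helper name.

```python
from typing import Optional


def charmap_error(text: str, codec: str = "cp1252") -> Optional[str]:
    """Return the encode error message for text, or None if it encodes cleanly."""
    try:
        text.encode(codec)
    except UnicodeEncodeError as err:
        return str(err)
    return None


# U+FFFD has no mapping in cp1252, so this reports a 'charmap' codec error.
print(charmap_error("bad \ufffd\ufffd"))

# Plain ASCII text encodes without trouble, so this prints None.
print(charmap_error("plain ascii"))
```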

An alternative approach is to set one of the environment variables PYTHONIOENCODING=utf-8 or PYTHONUTF8=1.
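As a rough check of what those variables do, the sketch below starts a child interpreter with one of them set and reports the encoding its stdout ends up with. `child_stdout_encoding` is a hypothetical helper, not part of the test suite.

```python
import os
import subprocess
import sys


def child_stdout_encoding(extra_env: dict) -> str:
    """Run a child Python with extra_env set and return its sys.stdout.encoding."""
    env = dict(os.environ)
    env.update(extra_env)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdout.encoding)"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip().lower()


print(child_stdout_encoding({"PYTHONIOENCODING": "utf-8"}))
```

Unlike encoding in the test code itself, this relies on every environment that runs the tests setting the variable.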

Encode tokenizer outputs as utf8 before printing them. #247