mistralai / mistral-common

Apache License 2.0

[BUG]: Is the documentation example code outdated? #63

Open JINO-ROHIT opened 2 weeks ago

JINO-ROHIT commented 2 weeks ago

Python -VV

3.11

Pip Freeze

-

Reproduction Steps

Running the example snippet from the tokenization guide (https://docs.mistral.ai/guides/tokenization/) throws an error.

Specifically, this line:

tokenizer = MistralTokenizer.v3(is_tekken=True)

raises:

UnicodeDecodeError: 'charmap' codec can't decode byte 0x81 in position 32549: character maps to <undefined>
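
For context, a 'charmap' decode error on byte 0x81 is what Python typically raises when a UTF-8 file is read with a Windows code page such as cp1252, which has no character at 0x81. The sketch below reproduces that class of error without mistral-common. It assumes (not confirmed here) that the tokenizer file is UTF-8 and is opened somewhere with the platform default encoding; `vocab.json`, `read_with`, and `reproduce` are hypothetical names used only for illustration.

```python
import os
import tempfile


def read_with(path, encoding):
    """Read a text file using an explicit encoding."""
    with open(path, encoding=encoding) as f:
        return f.read()


def reproduce():
    """Write a small UTF-8 file, then read it as cp1252 and as UTF-8.

    Returns (error_message_from_cp1252_read, text_from_utf8_read).
    """
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-in for a UTF-8 encoded tokenizer file.
        path = os.path.join(tmp, "vocab.json")
        with open(path, "w", encoding="utf-8") as f:
            # U+0141 is stored as bytes C5 81 in UTF-8;
            # byte 0x81 is undefined in cp1252.
            f.write('{"token": "\u0141"}')

        try:
            read_with(path, "cp1252")  # mimics a Windows default encoding
            error_message = ""
        except UnicodeDecodeError as exc:
            error_message = str(exc)

        # Passing the encoding explicitly avoids the problem.
        text = read_with(path, "utf-8")
        return error_message, text


if __name__ == "__main__":
    message, text = reproduce()
    print("cp1252 read failed with:", message)
    print("utf-8 read returned:", text)
```

If this is the cause, running Python in UTF-8 mode (setting the `PYTHONUTF8=1` environment variable, or `python -X utf8`) might work around it until the file is opened with `encoding="utf-8"`. This has not been verified against mistral-common.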

Expected Behavior

No error: the tokenizer should load successfully.

Additional Context

No response

Suggested Solutions

No response