bheinzerling / bpemb

Pre-trained subword embeddings in 275 languages, based on Byte-Pair Encoding (BPE)
https://nlp.h-its.org/bpemb
MIT License
1.18k stars, 101 forks

setup: ensure utf-8 encoding when reading README.md #40

Closed (hartb closed 4 years ago)

hartb commented 4 years ago

setup.py reads README.md when preparing the build, and README.md is UTF-8 encoded.

Specify the encoding when opening the file to avoid build failures in environments where the default locale encoding isn't UTF-8. For example, some container environments default to an ASCII-based encoding such as 'ANSI_X3.4-1968'.
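
For illustration, here is a minimal sketch of the pattern described above. It is not the actual diff from this PR: the helper name `read_long_description` and the sample file contents are hypothetical. It shows that passing `encoding="utf-8"` to `open()` reads non-ASCII text correctly, while decoding the same file as ASCII (roughly what happens when the locale default is 'ANSI_X3.4-1968') raises `UnicodeDecodeError`.

```python
import os
import tempfile


def read_long_description(path, encoding="utf-8"):
    """Read a text file with an explicit encoding instead of the locale default.

    Hypothetical helper for illustration; not taken from bpemb's setup.py.
    """
    with open(path, encoding=encoding) as f:
        return f.read()


# Self-contained demo: write a UTF-8 file with non-ASCII text, then read it back.
with tempfile.TemporaryDirectory() as tmp:
    readme = os.path.join(tmp, "README.md")
    with open(readme, "w", encoding="utf-8") as f:
        f.write("BPEmb: café, 日本語\n")

    # An explicit UTF-8 read does not depend on the locale.
    long_description = read_long_description(readme)

    # Simulate an ASCII default: decoding the same bytes fails.
    try:
        read_long_description(readme, encoding="ascii")
        ascii_failed = False
    except UnicodeDecodeError:
        ascii_failed = True
```

In a setup.py, the returned string would typically be passed as the `long_description` argument to `setup()`.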

bheinzerling commented 4 years ago

Thanks!