Open ice1000 opened 4 years ago
You could submit a PR; it is probably easier for you to test the change, as you have the right context (Windows, non-English locale).
This applies to non-Windows systems as well. If your locale is not set, or is set to something like LANG=ascii, reading files with Unicode in them will result in this error. In fact, we received this error on a Linux machine as well with this happy file: https://github.com/erikd/language-javascript/blob/eef1887d430c18b108ff723479c3f1ef50c0e9b2/src/Language/JavaScript/Parser/Grammar7.y
I fixed an issue exactly like this one with hpc: https://gitlab.haskell.org/ghc/ghc/issues/17073
The same fix could be applied here. Haskell source files are always assumed to be encoded in UTF-8; the same principle could be applied to happy .y files.
This has caused problems for people trying to build the purescript compiler from source too, e.g. https://github.com/purescript/purescript/issues/3813 and https://github.com/erikd/language-javascript/issues/86. I think having happy always assume that .y files are UTF-8 encoded would indeed be a good option.
I struggled with this. This combination worked for me:
echo "LC_CTYPE=\"en_US.UTF-8\"" | sudo tee -a /etc/default/locale
echo "LC_ALL=\"en_US.UTF-8\"" | sudo tee -a /etc/default/locale
echo "LANG=\"en_US.UTF-8\"" | sudo tee -a /etc/default/locale
echo "LC_ALL=en_US.UTF-8" | sudo tee -a /etc/environment
echo "en_US.UTF-8 UTF-8" | sudo tee -a /etc/locale.gen
echo "LANG=en_US.UTF-8" | sudo tee -a /etc/locale.conf
sudo locale-gen en_US.UTF-8
Before the file is read here (currently with readFile): https://github.com/simonmar/happy/blob/27596ff0ce0171d485bf96d38943ffc760923c90/src/Main.lhs#L72-L74
we could call hSetEncoding h IO.utf8 on the handle first. See https://github.com/agda/agda/issues/4161#issuecomment-548085906
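For illustration, a minimal sketch of that idea. Since readFile does not expose a handle, the sketch opens the file explicitly, sets the encoding, and then reads. The name readFileUtf8 is hypothetical and not taken from happy's source; how it would be wired into Main.lhs is left open.

```haskell
import System.IO

-- | Read a file as UTF-8 regardless of the current locale.
-- Hypothetical helper for illustration; not part of happy's actual source.
readFileUtf8 :: FilePath -> IO String
readFileUtf8 path = do
  h <- openFile path ReadMode
  -- Override the locale-derived default encoding on this handle only.
  hSetEncoding h utf8
  -- Lazy read: the handle is closed once the contents are fully consumed.
  hGetContents h
```

With something like this, a call such as readFileUtf8 "Grammar7.y" should decode the file the same way whether LANG is unset, ascii, or en_US.UTF-8, so the locale workarounds above would no longer be needed just to run happy.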