serokell / haskell-with-utf8

Get your IO right on the first try
https://serokell.io/blog/haskell-with-utf8

`withUtf8` should also `setFileSystemEncoding` #8

Closed: temyurchenko closed this issue 2 years ago

temyurchenko commented 4 years ago

This "file system encoding" is responsible for decoding command-line arguments and environment variables (source). If we assume that stdin is encoded in UTF-8, it's fair to assume that command-line arguments have the same encoding.

I have doubts about environment variables, though.

The issue has arisen in this discussion.

Calling `setFileSystemEncoding` during `withUtf8` initialization seems to be sufficient to fix the problem.
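A minimal sketch of what this proposal could look like, for illustration only. `withUtf8FileSystemEncoding` is a hypothetical helper and not part of the library; it relies on `getFileSystemEncoding`, `setFileSystemEncoding` and `utf8` from `GHC.IO.Encoding` in `base`. It assumes a POSIX system, where `getArgs` decodes arguments with the current file system encoding. Note that the default file system encoding is set up to round-trip undecodable bytes, while plain `utf8` reports them as errors.

```haskell
import Control.Exception (bracket)
import GHC.IO.Encoding (getFileSystemEncoding, setFileSystemEncoding, utf8)
import System.Environment (getArgs)

-- Hypothetical helper (NOT part of with-utf8): run an action with the
-- file system encoding forced to UTF-8, then restore the previous one.
withUtf8FileSystemEncoding :: IO a -> IO a
withUtf8FileSystemEncoding action =
  bracket
    (getFileSystemEncoding <* setFileSystemEncoding utf8) -- remember the old encoding, then switch
    setFileSystemEncoding                                 -- restore the old encoding on exit
    (const action)

main :: IO ()
main = withUtf8FileSystemEncoding $ do
  -- Assumption: on POSIX, arguments are decoded when getArgs is called,
  -- so the override has to be in place before this point.
  args <- getArgs
  mapM_ print args
```

Using `bracket` keeps the override scoped to the wrapped action, so code outside it still sees the locale-derived default.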

If you think it's a good idea, I can make a PR.

kirelagin commented 4 years ago

`withUtf8` does not assume that stdin is encoded in UTF-8, because this assumption is not safe to make. It assumes that stdin is encoded in the “terminal encoding” (i.e. the one from the locale), unless stdin is redirected to a file, in which case it assumes UTF-8.

It is never safe to assume that the “terminal encoding” is anything other than the one from the locale, so it is not safe to set the file system encoding to anything but the default.

Could you please provide more details about the problem with command-line arguments that you are trying to solve?
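The rule described above can be sketched roughly as follows. This is a simplified illustration and not the library's actual implementation; `Choice`, `chooseEncoding` and `applyChoice` are hypothetical names, and only `hIsTerminalDevice`, `hSetEncoding` and `utf8` from `System.IO` are real.

```haskell
import System.IO (Handle, hIsTerminalDevice, hSetEncoding, stdin, utf8)

-- Hypothetical type naming the two outcomes of the rule.
data Choice = KeepLocaleEncoding | UseUtf8
  deriving (Eq, Show)

-- Terminal: trust the locale. Anything else (file, pipe): assume UTF-8.
chooseEncoding :: Bool -> Choice
chooseEncoding isTerminal
  | isTerminal = KeepLocaleEncoding
  | otherwise  = UseUtf8

-- Apply the rule to a handle and report which branch was taken.
applyChoice :: Handle -> IO Choice
applyChoice h = do
  choice <- chooseEncoding <$> hIsTerminalDevice h
  case choice of
    KeepLocaleEncoding -> pure ()              -- leave the handle's encoding untouched
    UseUtf8            -> hSetEncoding h utf8  -- redirected input: switch to UTF-8
  pure choice

main :: IO ()
main = applyChoice stdin >>= print
```

Running the program interactively and then with stdin redirected from a file should exercise the two branches.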

temyurchenko commented 4 years ago

`LC_ALL=C ./my-command "non-ascii-argument"`

The problem is how the "non-ascii-argument" is interpreted by `my-command`. I thought it would make sense to decode it as if it were encoded in UTF-8 when using `Main.Utf8.withUtf8`.
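To see how an argument is actually decoded, a small program along these lines can be run under different locales. It is only a diagnostic sketch; `codePoints` is a hypothetical helper. Under `LC_ALL=C`, GHC typically cannot decode non-ASCII bytes and represents them with escape code points, so the output is expected to differ from a run under a UTF-8 locale.

```haskell
import Data.Char (ord)
import Numeric (showHex)
import System.Environment (getArgs)

-- Hypothetical helper: render each character as its Unicode code point in hex.
codePoints :: String -> [String]
codePoints = map (\c -> "U+" ++ showHex (ord c) "")

main :: IO ()
main = do
  args <- getArgs
  -- One line per argument, listing the code points getArgs produced.
  mapM_ (putStrLn . unwords . codePoints) args
```

Comparing the output of the same invocation with and without `LC_ALL=C` shows which encoding was used to decode the argument.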

I've heard your arguments and I see that I'm wrong.

Martinsos commented 2 years ago

Would it make sense to close this issue? It seems to be resolved.