xiaoyifang / goldendict-ng

The Next Generation GoldenDict
https://xiaoyifang.github.io/goldendict-ng/

feat: write word to Program dictionary's stdin in UTF-8 instead of local 8 bit #1743

Closed: shenlebantongying closed this 2 months ago

shenlebantongying commented 2 months ago

A one-line change. No impact on Unix; it only affects today's Windows, and only in rare situations.

The root problem is that Python, since 3.6 (released in 2016), assumes stdin on Windows is UTF-8 ^1.

It is practically impossible for an ordinary user to diagnose this issue, and they are unlikely to find out what their local code page is or how to deal with it.

A programming language has to take extra care to make the needed Windows APIs available, but the important languages simply don't bother, including Python^1, Rust^5, Go, Java 17+^4, …

At a high level, both Unix's locale-dependent encodings and Windows's code pages are 💩💩💩💩💩 that sane programmers generally avoid. In fact, Windows 11 defaults the code page to UTF-8 ^3.

GD's original code assumes that programs on Windows will use the Windows code page 💩 to process data, but that is no longer true.

Since Python assumes stdin is UTF-8, I don't see why we shouldn't write stdin in UTF-8. This eliminates the rare Unicode error on Windows for Python.
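The mismatch described above can be reproduced without GoldenDict. The sketch below is only an illustration: the word `café` and the code page `cp1252` are example choices, not taken from this PR, and the helper names are hypothetical. It encodes the same word both ways and then decodes it the way a UTF-8-assuming child process would.

```python
# Illustration only: the word and the code page are assumptions, not from the PR.
word = "café"

local_8bit = word.encode("cp1252")  # bytes a legacy Windows code page would produce
utf8 = word.encode("utf-8")         # bytes written when the word is sent as UTF-8


def decode_as_utf8(data: bytes) -> str:
    """Decode bytes the way a child process that assumes UTF-8 stdin would."""
    return data.decode("utf-8")


def survives_utf8_reader(data: bytes) -> bool:
    """Return True if a UTF-8 reader recovers the original word from data."""
    try:
        return decode_as_utf8(data) == word
    except UnicodeDecodeError:
        # The code-page bytes are not valid UTF-8, so the reader fails here.
        return False


print("local 8-bit bytes:", local_8bit, "readable:", survives_utf8_reader(local_8bit))
print("UTF-8 bytes:      ", utf8, "readable:", survives_utf8_reader(utf8))
```

Words made only of ASCII characters encode identically in both cases, which is consistent with the error being rare.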

If an encoding error does occur on Windows with a program dictionary, the user can now deterministically and clearly identify the root cause and the direction for fixing it.
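On the script side, one way to make the encoding explicit is to read raw bytes from stdin and decode them as UTF-8, rather than relying on whatever Python picks by default. This is a minimal sketch of a hypothetical program dictionary script, not code from GoldenDict: `read_word` and `main` are made-up names, and whether the word arrives with a trailing newline and how output should be encoded are assumptions here.

```python
import io
import sys


def read_word(stream) -> str:
    """Read the looked-up word from a binary stream and decode it as UTF-8."""
    # strip() is a precaution in case a trailing newline is sent (assumption).
    return stream.read().decode("utf-8").strip()


def main() -> None:
    # sys.stdin.buffer yields raw bytes, so the active code page is not involved.
    word = read_word(sys.stdin.buffer)
    # Assumption: the reply is also written as UTF-8 bytes.
    sys.stdout.buffer.write(f"You looked up: {word}\n".encode("utf-8"))


# Simulate the incoming word with an in-memory stream instead of a real pipe.
fake_stdin = io.BytesIO("naïve\n".encode("utf-8"))
print(read_word(fake_stdin))
```

A real script would call `main()` at the bottom. If the incoming bytes are not valid UTF-8, `read_word` raises `UnicodeDecodeError`, which points directly at the encoding as the root cause.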

sonarcloud[bot] commented 2 months ago

Quality Gate passed

Issues
0 New issues
0 Accepted issues

Measures
0 Security Hotspots
0.0% Coverage on New Code
0.0% Duplication on New Code


xiaoyifang commented 2 months ago

> Since Python assumes stdin is UTF-8, I don't see why we shouldn't write stdin in UTF-8. This eliminates the rare Unicode error on Windows for Python.

+1