phar-io / phive

The Phar Installation and Verification Environment (PHIVE)
https://phar.io
BSD 3-Clause "New" or "Revised" License

Installation issues with umlauts under Windows #237

Open ravage84 opened 4 years ago

ravage84 commented 4 years ago

Windows 7
PHP 7.1
Phive 0.13.2
A Windows user name with an umlaut in it, like mine: "Marc Würth"

With the above environment, after installing a tool such as PHPCS, it is not possible to execute the Phar.

D:\dev\xampp\htdocs\phpmd>php phive.phar install phpcs
Phive 0.13.2 - Copyright (C) 2015-2020 by Arne Blankerts, Sebastian Heuer and Contributors
Linking C:\Users\Marc Würth\.phive\phars/phpcs-3.5.5.phar to D:\dev\xampp\htdocs\phpmd\tools/phpcs.bat

D:\dev\xampp\htdocs\phpmd>tools\phpcs.bat
Could not open input file: C:\Users\Marc W├╝rth\.phive\phars/phpcs-3.5.5.phar

Using either --copy or --global works, though.

I guess the very brief and simple documentation at https://phar.io/#Usage could be improved to point new users of PHIVE to such alternatives when they encounter problems.

Maybe a "troubleshooting" section would help?

theseer commented 4 years ago

This certainly looks like an encoding problem to me, given that the "ü" gets mangled into two chars...

I'm a bit reluctant to tag this as a bug but it certainly is not how it's supposed to work ;)

Even more so, I'm not happy with merely documenting it as "doesn't work"; I'd like to understand how and why that happens. I don't regularly use Windows, so I'm a bit lost as to what the root cause of the encoding falling apart might be...

@ThomasWeinert Maybe you have some pointers? You're the only other Windows user I know ;)

MacFJA commented 4 years ago

I did some tests yesterday.

It's as you said, an issue on Windows only.

The Windows console doesn't handle non-ASCII characters in .bat files well.

I didn't find many ways to solve this; in fact, I only found two:

The issue with chcp 65001: it seems to require installing an additional font (from what I understand, that's because the default Windows console font can't display UTF-8 characters).

The issue with chcp 1252: it's not a universal solution; it will only work with European characters.

More docs about: CHCP command, CHCP codes, Note about chcp 65001
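To illustrate why code page 1252 is not a universal fix, here is a minimal Python sketch that checks which user names can be encoded at all. The helper `representable` and the sample names are hypothetical and not part of PHIVE; Python's `utf-8` codec stands in for what Windows calls code page 65001.

```python
def representable(text: str, code_page: str) -> bool:
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


# Hypothetical user names, chosen only to cover different scripts.
names = ["Würth", "Łukasz", "Дмитрий", "太郎"]

for name in names:
    print(name, representable(name, "cp1252"), representable(name, "utf-8"))
```

Under these assumptions only the first name fits into code page 1252, so even some European names fall outside it, while UTF-8 can represent all of them.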

theseer commented 4 years ago

Thanks, @MacFJA!

Let me get this straight: Windows, in 2020 (!), is not supporting UTF-8 by default. In 2020. Srsly?

:speak_no_evil:

But that aside, I'm not sure I understand it anyhow: shouldn't the encoding in itself be consistent? It's not like we're doing anything special here; we just take the path as we get it from the OS and put it into a .bat file. I'm not aware of any mangling we might do. So how can it not work? ;-)

MacFJA commented 4 years ago

From what I understand, the issue is that the command interpreter (cmd.exe) uses the encoding of the local user (and it's not UTF-8). (Some reference from Microsoft: https://devblogs.microsoft.com/commandline/windows-command-line-unicode-and-utf-8-output-text-buffer/)

That said, I'm not sure the Windows version I used was up to date 😅; it was merely a Windows machine accessed over RDP, used to check Internet Explorer compatibility.
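The mangled path from the original report can be reproduced with a short Python sketch. It assumes the path was written to the .bat file as UTF-8 bytes and that the console read those bytes using an OEM code page such as CP437 or CP850; the thread does not state which code page was actually active, so both are only guesses.

```python
# Path as reported in the issue; the "ü" is the only non-ASCII character.
path = r"C:\Users\Marc Würth\.phive\phars/phpcs-3.5.5.phar"

# Assumption: the .bat file stores the path as UTF-8, so "ü" becomes the bytes C3 BC.
utf8_bytes = path.encode("utf-8")

# Assumption: the console interprets each byte with a single-byte OEM code page.
for oem_code_page in ("cp437", "cp850"):
    misread = utf8_bytes.decode(oem_code_page)
    print(oem_code_page, misread)
```

In both code pages the byte C3 maps to "├" and BC maps to "╝", which matches the "W├╝rth" shown in the error message above.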

My test was as simple as:

The first line is executed, but I get a "file not found" error on the second.


I just found sapi_windows_cp_conv() (added in PHP 7.1). I will try to play with it tomorrow to see if I can resolve this issue.