syntaqx / git-hooks

A collection of git hooks for use with pre-commit
MIT License
34 stars 17 forks source link

Fix shfmt when paths have special characters or spaces #3

Closed asottile closed 4 years ago

asottile commented 4 years ago

Originally from https://github.com/pre-commit/pre-commit/issues/1229

kvalkhof75 commented 4 years ago

I had some figuring out to do to use your suggested changes in shfmt.sh. In addition to your change I forced set -x for debugging. Here are my findings on my system where I use bash from WSL (as Ubuntu 18.04 LTS):

Without my changes:

(venv) C:\Workspace\pre-commit>pre-commit run shfmt --file bash_script.sh
Check shell style with shfmt.............................................Failed
hookid: shfmt

/bin/bash: C:UsersCase.cachepre-commitrepomaxphuxohooksshfmt.sh: No such file or directory

Notice that the problem is not in shfmt.sh because bash cannot even find the shell script. This is the problem that I was trying to address. I was debugging the pre-commit Python code and in file parse_shebang.py function normalize_cmd variable exe was filled with the string: "C:\Users\Case.cache\pre-commit\repomaxphuxo\hooks\shfmt.sh" which is eventually passed to bash. And the slashes are incorrectly seen as escape characters, as shown in the output.

Using my changes:

(venv) C:\Workspace\pre-commit>pre-commit run shfmt --file bash_script.sh
Check shell style with shfmt.............................................Passed
hookid: shfmt

+ readonly DEBUG=unset
+ DEBUG=unset
+ '[' unset '!=' unset ']'
+ command -v shfmt
+ cmd=(shfmt "$@")
+ readonly cmd
+ echo '[RUN] shfmt' -i 4 -w -s bash_script.sh
[RUN] shfmt -i 4 -w -s bash_script.sh
++ shfmt -i 4 -w -s bash_script.sh
+ output=
+ readonly output
+ '[' -n '' ']'
+ echo '[PASS]'
[PASS]

Note that I changed .pre-commit-config.yaml because your suggested change made me find a bug which I cannot push to my branch in my repo from where I currently am.

-   repo: https://github.com/syntaqx/git-hooks
    rev: v0.0.16
    hooks:
    -   id: shfmt
        verbose: true
        args: [-i, "4", -w, -s]

Can I provide you with more info?

syntaqx commented 4 years ago

Is this breaking only from this merge or the revision tag? I can revert if this is breaking anyone currently until I have more time to look into this, a bit swamped at the moment though.

asottile commented 4 years ago

so you have windows python but WSL bash? that mixing isn't supported I'm afraid -- you'll want to use (windows, windows) or (WSL, WSL)

kvalkhof75 commented 4 years ago

I have Python in WSL as well, but for Windows I use the Python from Windows, and when I am in WSL, then I use the WSL Python.The only thing that I wanted is to use bash in Windows just as I use bash in WSL. And the method I came up with seems to work with MSys64 as well and will probably work with cygwin.Is there I Windows bash that I missed? Am I understanding the purpose of the bash hook? My understanding was that pre-commit is supposed to be as OS agnostic as possible. All other hooks that I used are. I can probably even rewrite the shfmt hook not to use bash, as there actually is an shfmt executable for windows. But for clang-format it will be harder. And why bother? with a small change I made the bash hook compatible with windows, and the existing hooks work out of the box (I did test clang-format as well.) And indeed, all hooks that I have set work when I do a git commit ... from WSL. But it would be so nice if I could achieve the same from the windows command prompt. So I made a POC or MVP...

Now I raised your attention a bit early by accidentally making a pull request while finding out how Github works. (I normally use Gitlab.)

So, I need to understand the purpose of the bash hook in windows. Is that supposed to work from the windows command prompt? Or did I miss something else? Is it a nice feature if bash hooks can be started from the windows command prompt?

asottile commented 4 years ago

from what I gather there's something strange about your setup -- bash hooks work fine for the default installation of python and git on windows (and the setup that runs in azure pipelines's windows builders)

I'm not sure why your bash isn't handling the windows paths like the bash on the windows machines I have access to is

kvalkhof75 commented 4 years ago

Ok, just to make sure I understand it: The bash you work with correctly convert Windows paths to POSIX so that: C:\>bash -c C:\Users\kvalkhof\shfmt.sh will work?

Output on my computer, which is as I expected:

C:\Workspace>bash -c C:\Users\kvalkhof\.cache\pre-commit\repoqvypm_m7\hooks\shfmt.sh
/bin/bash: C:Userskvalkhof.cachepre-commitrepoqvypm_m7hooksshfmt.sh: command not found

C:\Workspace>c:\msys64\usr\bin\bash.exe -c C:\Users\kvalkhof\.cache\pre-commit\repoqvypm_m7\hooks\shfmt.sh
/usr/bin/bash: C:Userskvalkhof.cachepre-commitrepoqvypm_m7hooksshfmt.sh: command not found

Or are we still misunderstanding each other?

asottile commented 4 years ago

precisely

try which -a bash?

kvalkhof75 commented 4 years ago
C:\Workspace>which -a bash
/c/Windows/system32/bash
/usr/bin/bash

If I am correct, the WSL bash and the MSys64 bash respectively.

asottile commented 4 years ago

ah yep, for me it just gives /usr/bin/bash :thinking:

kvalkhof75 commented 4 years ago

Could the way Git is installed on Windows have any influence. I am pointing to the choice the Git installer gives for the shell that should be used. I will try to test that if I can find some time tonight.

Mmm, /usr/bin/bash could be one of many bashes installed on Windows. Git bash, MSys bash, ...

WSL:

C:\Workspace>bash
kvalkhof@NLD-IND-L1510:/mnt/c/Workspace$ which -a bash
/bin/bash

MSys64:

C:\Workspace>\msys64\usr\bin\bash.exe

kvalkhof@NLD-IND-L1510  /c/Workspace
# which -a bash
/c/WINDOWS/System32/bash
/usr/bin/bash

Git bash:

C:\Workspace>"\Program Files\Git\git-bash.exe"

kvalkhof@NLD-IND-L1510 MINGW64 /c/Workspace
$ which -a bash
/usr/bin/bash
/bin/bash
/usr/bin/bash
/c/WINDOWS/System32/bash
/c/msys64/usr/bin/bash
kvalkhof75 commented 4 years ago

Just in case path order rings a bell:

PATH=
C:\ProgramData\DockerDesktop\version-bin
C:\Program Files\Docker\Docker\Resources\bin
C:\WINDOWS
C:\WINDOWS\System32
C:\WINDOWS\System32\Wbem
C:\WINDOWS\System32\WindowsPowerShell\v1.0\
C:\Python27\Scripts
C:\Program Files\Python37\Scripts
C:\Program Files\Python37
C:\Program Files\PuTTY\
C:\msys64\usr\bin
C:\Program Files\TortoiseSVN\bin
C:\Program Files\nodejs\
C:\Program Files\gitlab-runner
C:\Program Files\AdoptOpenJDK\jdk-11.0.3.7-openj9\bin
C:\Program Files\TortoiseGit\bin
C:\Program Files\Git\cmd
C:\Users\kvalkhof\AppData\Local\Microsoft\WindowsApps
C:\Users\kvalkhof\.dotnet\tools
C:\Users\kvalkhof\AppData\Roaming\npm
asottile commented 4 years ago

hmmm yeah bash being earlier on the PATH due to system32 is probably going to be a problem -- does it work if you put the msys directory earlier?

mine looks like this (which I guess would put the WSL bash first too if I had it installed)

C:\Users\Anthony>echo %PATH% | sed "s/;/\n/g"
C:\Python37\Scripts\
C:\Python37\
C:\WINDOWS\system32
C:\WINDOWS
C:\WINDOWS\System32\Wbem
C:\WINDOWS\System32\WindowsPowerShell\v1.0\
C:\WINDOWS\System32\OpenSSH\
C:\Program Files\Git\cmd
C:\Program Files\Git\mingw64\bin
C:\Program Files\Git\usr\bin
C:\Program Files (x86)\Windows Live\Shared
C:\Users\Anthony\.cargo\bin
C:\Users\Anthony\AppData\Local\Microsoft\WindowsApps
kvalkhof75 commented 4 years ago

I can confirm that pre-commit correctly executes the bash_script.sh when MSys2 is the first in the path. I will investigate this further and report the difference between MSys2 and WSL as soon as I have some time for it. Is this still the correct thread to do so? Thanks for the hint, I would not have tried this due to the result of C:\Workspace>c:\msys64\usr\bin\bash.exe -c C:\Users\kvalkhof\.cache\pre-commit\repoqvypm_m7\hooks\shfmt.sh.

asottile commented 4 years ago

probably not, I'm not sure where the correct place to report that is, seems like a WSL bug to me

kvalkhof75 commented 4 years ago

Here are my findings regarding the bash hook on Windows using WSL bash and MSys2-64 bash (including Git bash).

Note that I used bash -c in my earlier findings to execute a shell script. This failed for both bash-es in the same way as the below failure with WSL bash. And this does not represent how 'pre-commit' uses bash. (And I made some other errors that made me running in the wrong direction :-( , eating up time and providing experience.)

The intended use for MSys2-64 bash is starting Windows tools/executables (installed in Windows) with only a minimal set of POSIX tools available. Windows executables expect Windows paths as parameters, but most likely tolerate POSIX directory separators (see below). POSIX tools work with both Windows and POSIX paths.

The intended use for WSL bash is starting Linux tools/executables (installed in Linux, unavailable for Windows) with a full set of POSIX tools available. All paths shall of course be POSIX compliant.

My original problem was performing a git commit in Windows also using the pre-commit tools shfmt and clang-format. This is resolved by installing these tools in Windows as well, and starting git commit while MSys2-64 bash is first in PATH. (My first resolution was to try to make pre-commit use the tools as installed on WSL.)

This however leaves me with the small issue that I want bash to be WSL bash in the same terminal where I run git commit, which wants MSys2-64 bash.

If we decide to address this. :-) Then I see two options, and there may of course be more.

  1. Add an option like default_language_version where I can state that MSys2-64 bash must be used if found, ignoring the PATH order.
  2. Make parsing she-bangs only generate POSIX paths. This will not break MSys2-64 bash as far as I investigated (see below), and it just sounds logical to me.

Thanks!

bash_script.sh

#!/bin/bash
echo "$0" "$@"

WSL bash first in PATH

C:\Workspace\pre-commit>pre-commit run shfmt --files tests\bash_script.sh
Check shell style with shfmt.............................................Failed
hookid: shfmt

/bin/bash: C:Userskvalkhof.cachepre-commitrepolm4k713rhooksshfmt.sh: No such file or directory
C:\Workspace\pre-commit>C:\Windows\System32\bash.exe C:\Workspace\pre-commit\bash_script.sh Hello C:\Workspace\pre-commit\bash_script.sh
/bin/bash: C:Workspacepre-commitbash_script.sh: No such file or directory

MSys2-64 bash first in PATH

C:\Workspace\pre-commit>pre-commit run shfmt --files tests\bash_script.sh
Check shell style with shfmt.............................................Passed
hookid: shfmt

[RUN] shfmt -i 4 -w -s tests\bash_script.sh
[PASS]

C:\Workspace\pre-commit>pre-commit run shfmt --files tests/bash_script.sh
Check shell style with shfmt.............................................Passed
hookid: shfmt

[RUN] shfmt -i 4 -w -s tests\bash_script.sh
[PASS]
C:\Workspace\precommit>C:\msys64\usr\bin\bash.exe C:\Workspace\pre-commit\bash_script.sh Hello C:\Workspace\pre-commit\bash_script.sh
C:\Workspace\pre-commit\bash_script.sh Hello C:\Workspace\pre-commit\bash_script.sh

C:\Workspace\precommit>C:\msys64\usr\bin\bash.exe /c/Workspace/pre-commit/bash_script.sh Hello C:\Workspace\pre-commit\bash_script.sh
/c/Workspace/pre-commit/bash_script.sh Hello C:\Workspace\pre-commit\bash_script.sh

Windows [Canonicalizing separators]() step during [Path normalization].

This (most likely) means that any relative POSIX path is os independent and will work as expected. All references to files in the repository that pre-commit checks are relative paths.

[path normalization]: https://docs.microsoft.com/en-us/dotnet/standard/io/file-path-formats#path-normalization [canonicalizing separators]: https://docs.microsoft.com/en-us/dotnet/standard/io/file-path-formats#canonicalizing-separators

asottile commented 4 years ago

sorry just now getting back to this -- I don't think we can use posix paths due to multiple drives (a posix path may be correct only on the primary drive)

seems to me like WSL bash should just be avoided on windows whenever possible -- I'm a bit surprised how early it is in PATH (which ~breaks lots of things expecting msys bash)

kvalkhof75 commented 4 years ago

Hello Anthony,

Thanks for replying.

I am configuration manager at Ley where I am responsible for the developer tools as well supporting over 100 developpers. The fact that WSL bash is earlier in the PATH by default caused me to support some novice developpers that did not immediately understood what went wrong. I can't blame them. It is my documentation that became more complex than anticipated due to the PATH issue...

I did get some time to think about the issue. The solution to the drive prefix on Windows is actually not even that hard once you understand that you only need it once, and it is even cachable. Let me explain.

The bash command looks roughly like this:

[P1]\bash.exe [P2]\script.sh [Fixed user parameters: P3] [Files to check: P4]

So only P2 has the drive prefix issue. But this path always starts with the path of the pre-commit cache directory. (Because all tools are installed there.) So if you know the POSIX path of the cache directory then P2 can be generated by a default path join. And for each bash on the system you only need to remember this one prefix. So you can make a dict with the absolute path to bash as key and the prefix to the cache dir as value to chache them. (In the rare and absurd case someone uses more than one bash with pre-commit...)

You can get this prefix by executing:

CD /D [P2] && [P1]/bash -c pwd

Note the /D. (Add --noprofile and --norc ?) Note that the above solution is simpler that the pull request that I accidentally posted.

As an alternative solution, a configuration option could be made where the absolute path to the bash to use could be set. Or an option to state that the PATH sould be ignored and the bash shipped with git should be used which is MSys2.

And I am willing to do the work...