git-for-windows / git

A fork of Git containing Windows-specific patches.
http://gitforwindows.org/
Other
8.19k stars 2.49k forks source link

Portable Git fails to use Windows ssh-agent Service if OpenSSH is installed in Program Files #4960

Open drandarov-io opened 1 month ago

drandarov-io commented 1 month ago

Setup

$ git --version --build-options
git version 2.45.1.windows.1
cpu: x86_64
built from commit: 965b16798dab6962ada5b0d8cf0dca68f385c448
sizeof-long: 4
sizeof-size_t: 8
shell-path: /bin/sh
feature: fsmonitor--daemon
$ cmd.exe /c ver
Microsoft Windows [Version 10.0.22635.3640]

Windows 11 Pro for Workstations 23H2 22635.3640

# One of the following:
> type "C:\Program Files\Git\etc\install-options.txt"
> type "C:\Program Files (x86)\Git\etc\install-options.txt"
> type "%USERPROFILE%\AppData\Local\Programs\Git\etc\install-options.txt"
> type "$env:USERPROFILE\AppData\Local\Programs\Git\etc\install-options.txt"
$ cat /etc/install-options.txt
Not present in Portable Git
  • I have OpenSSH installed unter C:\Program Files\OpenSSH\
  • The ssh-agent work without a problem in the terminal (e.g. ssh-add -l)

Details

Windows Terminal, Powershell 7.4.2

  1. Have user.signingkey set in .gitconfig
  2. user.singingkey is present when running ssh-add -l
  3. core.sshCommand is set to C:/Program\\ Files/OpenSSH/ssh.exe
  4. Try to run any git commit
  • Git should use the keys present in C:\Program Files\OpenSSH\ssh-agent.exe and create a signed commit
  • Git (likely) used the keys (that aren't present) in git/usr/bin/ssh-agent.exe instead
  • Command fails with windows error: Couldn't get agent socket?

-

  • Deleting the usr directory in the Git install location works around the problem and forces Git to use the correct agent
  • Using MinGit instead of Git also works around the problem even though the usr directory is present and unchanged. I don't know why this would work.
  • Setting core.sshCommand is not even needed as long as one of the two workarounds above are used.
  • I would assume that whatever happens, when you choose to not use the Git-included OpenSSH distribution in the non-portable setup, would need to be done in this case as well.
  • It took me many hours to find these workarounds. This shouldn't be the case. I believe this is either a bug or undocumented edge case.
dscho commented 1 month ago

Using the external OpenSSH is indeed not supported with Portable Git. In the Git for Windows installer, we can get away with deleting the shipped OpenSSH files, but that DeleteOpenSSHFiles() function is defined dynamically, in a non-trivial way.

Having said that, I could imagine that overriding the PATH in core.sshCommand might work:

[core]
  sshCommand = "PATH=\\\"C:/Program Files/OpenSSH:$PATH\\\" ssh.exe"

Rationale: The problem with ssh-agent is that the variant shipped with the Portable Git is found in the PATH before the Windows variant. This is because Git for Windows, for various reasons, has to edit the PATH so that programs like sh.exe are found. Since core.sshCommand can be a full command-line (and not only a path to an executable), we can use a command-line that adjusts PATH specifically for the ssh.exe invocation.

Please let us know whether that works for you.

drandarov-io commented 1 month ago

Thanks for the answer and pointing me to the code that addresses this topic. I was searching for bundled in this repo instead of the "build-extra" repo, so that explains why I got stuck with trying to understand this myself. 😅

I believe this also explains why the build-in path for ssh.exe in Windows 11 being C:\Windows\system32 usually doesn't cause an issue since it's likely treated differently in the PATH order.

Even with sshCommand = "PATH=\\\"C:/Program Files/OpenSSH:$PATH\\\" ssh.exe" the issue sadly remains.

$ git config --global core.sshCommand
16:28:09.780351 exec-cmd.c:243          trace: resolved executable dir: X:/Tools/git/mingw64/bin                                                                                                                                                                                                           
16:28:09.782358 git.c:465               trace: built-in: git config --global core.sshCommand                                                                                                                                                                                                               
PATH=\"C:/Program Files/OpenSSH:$PATH\" ssh.exe                                                                                                                                                                                                                                                            
$ git commit -m "test commit"  --
16:28:12.863081 exec-cmd.c:243          trace: resolved executable dir: X:/Tools/git/mingw64/bin
16:28:12.864083 git.c:465               trace: built-in: git commit -m 'test commit' --
16:28:12.869086 run-command.c:657       trace: run_command: ssh-keygen -Y sign -n git -f 'C:\Users\drand\AppData\Local\Temp/.git_signing_key_tmpqgwq6W' -U 'C:\Users\drand\AppData\Local\Temp/.git_signing_buffer_tmp50OF2j'
error: Couldn't get agent socket?

fatal: failed to write commit object

I think it's fair if this is not a priority issue for you and therefore not planned. Maybe a note in the README.portable about the equivalent DELETE_OPENSSH_FILES action might help other people in the future as an alternative if that's possible.

It's interesting however that MinGit doesn't have the same issue even though the OpenSSH files are indeed present under /usr/bin. That seems weird since the non-portable Git-Setup's fix is deleting these files, but MinGit uses the correct Agent nevertheless without deleting the files.

dscho commented 1 month ago

Even with sshCommand = "PATH=\\\"C:/Program Files/OpenSSH:$PATH\\\" ssh.exe" the issue sadly remains.

Hmm. I wonder what is going wrong there. Could you maybe verify with Process Monitor that the correct ssh-keygen is picked up?

On second thought, since ssh-keygen is mentioned: That's not ssh. So maybe it isn't core.sshCommand that needs to be configured, but gpg.ssh.program?

It's interesting however that MinGit doesn't have the same issue even though the OpenSSH files are indeed present under /usr/bin.

The reason is that ssh-keygen.exe is excluded from MinGit. It's not present in /usr/bin there.

drandarov-io commented 1 month ago

On second thought, since ssh-keygen is mentioned: That's not ssh. So maybe it isn't core.sshCommand that needs to be configured, but gpg.ssh.program?

You are indeed correct. I've never seen this configuration even mentioned anywhere. Setting it to the correct ssh-keygen.exe fixed the issue. 😊

[gpg "ssh"]
    program = C:/Program\\ Files/OpenSSH/ssh-keygen.exe

The reason is that ssh-keygen.exe is excluded from MinGit. It's not present in /usr/bin there.

You are right! I saw ssh-agent.exe and didn't even think to check for ssh-keygen.exe.


On a side-note: Interestingly, but not really important: core.sshCommand requires a \\ before the space in Files to function, while gpg.ssh.program requires it to be absent:

[gpg "ssh"]
    program = C:/Program Files/OpenSSH/ssh-keygen.exe
                           ^^^
[core]
    sshCommand = C:/Program\\ Files/OpenSSH/ssh.exe
                               ^^^

Even setting these in (Double-)Quotes doesn't change this fact.

The respective error messages, if this is not correctly set, are:

# Erroneously present \\ in ssh-keygen path
error: cannot spawn C:/Program\ Files/OpenSSH/ssh-keygen.exe: No such file or directory

and

# Erroneously absent \\ in ssh path
C:/Program Files/OpenSSH/ssh.exe: line 1: C:/Program: No such file or directory

Do you think adding a write-up of this information would be useful for the README.portable. In theory I could write a paragraph if there is interest. I'm not sure how such changes are looked upon or whether - if there were to be any interest in changes like that - you'd prefer to do it yourself.

dscho commented 1 month ago

On second thought, since ssh-keygen is mentioned: That's not ssh. So maybe it isn't core.sshCommand that needs to be configured, but gpg.ssh.program?

You are indeed correct. I've never seen this configuration even mentioned anywhere. Setting it to the correct ssh-keygen.exe fixed the issue. 😊

Yay!

Interestingly, but not really important: core.sshCommand requires a \\ before the space in Files to function, while gpg.ssh.program requires it to be absent

This is one of Git's many footguns, unfortunately.

The sshCommand is supposed to be a (partial) command-line, allowing users to pass additional parameters to the program. As such, it needs paths containing spaces to be quoted in order not to mistake the latter part as a command-line option.

Whereas the program is not a command-line, but the full path to the executable. If you quote it or add backslashes to escape characters, it won't work because those quotes and backslashes will be mistaken as being part of the absolute path.

Do you think adding a write-up of this information would be useful for the README.portable.

That would be quite awesome! Would you give it a try?

I would even be in favor of an additional PR to change Git for Windows' installer not to delete the OpenSSH files but instead to use the PATH trick in core.sshCommand and to configure gpg.ssh.program in addition.

drandarov-io commented 1 month ago

That would be quite awesome! Would you give it a try?

I've been quite busy the last couple days, but I'll give it a try in the following days.

I would even be in favor of an additional PR to change Git for Windows' installer not to delete the OpenSSH files but instead to use the PATH trick in core.sshCommand and to configure gpg.ssh.program in addition.

I'll try that as well, but I'll need to see if I'll have any issues getting used to the setup process/scripts.

dscho commented 1 month ago

I'll give it a try in the following days.

I'm looking forward to see your PR(s).

Okeanos commented 1 month ago

Just stumbled across this issue and wanted to chime in here.

I would even be in favor of an additional PR to change Git for Windows' installer not to delete the OpenSSH files but instead to use the PATH trick in core.sshCommand and to configure gpg.ssh.program in addition.

From what I still remember during implementation and in discussions with the maintainers, I specifically opted for the removal of OpenSSH instead of changing e.g. core.sshCommand (GIT_SSH_COMMAND). Why? Because it gives a consistent user experience across the whole Bash/Shell/Terminal regardless of which application a user engages with. Specifically, overriding the Git setting alone would only ensure that Git itself uses the correct SSH binaries – every other shell application is likely to do something different, i.e. use what is found first on the path.

There is, however, another workaround involving the path that isn't quite so messy if I recall correctly. Setting the following PATH override in the shell via .bashrc or .bash_profile (whatever applies):

export PATH="/c/Windows/System32/OpenSSH:${PATH}"

This will override all of the Git shell magic by ensuring Windows SSH (or any other SSH implementation) is found first on the path. Also, easily/dynamically reversible without having to reinstall Git. I don't think (without having tried it) it is possible to handle this via Windows (User) Environment settings. To the best of my memory/knowledge, Git for Windows does a few things effectively at runtime depending on how deeply integrated you want Git Bash binaries to be with Windows and will always prepend some of its binaries to the PATH as served from Windows via Windows (User) Environment.

dscho commented 1 month ago

I specifically opted for the removal of OpenSSH instead of changing e.g. core.sshCommand (GIT_SSH_COMMAND). Why? Because it gives a consistent user experience across the whole Bash/Shell/Terminal regardless of which application a user engages with.

I would agree that it is consistent across all of Git for Windows if it weren't for Portable Git.

export PATH="/c/Windows/System32/OpenSSH:${PATH}"

This will override all of the Git shell magic by ensuring Windows SSH (or any other SSH implementation) is found first on the path.

That misses git.exe's own PATH-prepending logic.

Okeanos commented 1 month ago

This will override all of the Git shell magic by ensuring Windows SSH (or any other SSH implementation) is found first on the path.

That misses git.exe's own PATH-prepending logic.

I could swear that when using the export I mentioned from within Bash it does work as expected. Not using Windows too much these days but can give it a try later on and report back. What specifically would I have to test in order to verify?

The interesting so to say statement in your link is the call to append_system_bin_dirs, right? If it does indeed do as it says it does (append and not prepend) … then shouldn't all be fine still? This just ensures that somewhere on the path, i.e. at the end, the expected system binaries can be found, regardless of what happened previously. Because my export prepends the path with the SSH binary location it'll come first and be used regardless of whether there's a second one later on?

dscho commented 1 month ago

This will override all of the Git shell magic by ensuring Windows SSH (or any other SSH implementation) is found first on the path.

That misses git.exe's own PATH-prepending logic.

I could swear that when using the export I mentioned from within Bash it does work as expected.

The crucial part is from within Bash. If you start git.exe from, say, PowerShell, where MSYSTEM is not set, this logic kicks in.

dscho commented 1 month ago

BTW there is a very good reason why deleting files is not the ideal solution: it makes it unnecessarily hard to switch back and forth between SSH variants (if only to verify when things fail using one variant that they also fail using the other variant).

Okeanos commented 1 month ago

This will override all of the Git shell magic by ensuring Windows SSH (or any other SSH implementation) is found first on the path.

That misses git.exe's own PATH-prepending logic.

I could swear that when using the export I mentioned from within Bash it does work as expected.

The crucial part is from within Bash. If you start git.exe from, say, PowerShell, where MSYSTEM is not set, this logic kicks in.

Ah right … yes. That is indeed a problem. I also concede that deleting the files is not nice (for the stated reasons) and would dearly love a better solution. That solution should, however, work for the whole shell instead of just Git.

dscho commented 1 month ago

That solution should, however, work for the whole shell instead of just Git.

That will need two separate configuration changes, then, the Git config change as well as an /etc/profile.d/ change. Hopefully we can keep this simple enough that it can be described concisely in Portable Git's README.