Open dscho opened 4 weeks ago
Thanks for the tip on running under sh -x /mingw64/bin/git-update-git-for-windows -y
!
My bash
-fu is absolute garbage, but here's what I get:
+ start '' /tmp/gfw-install-Uk97SUyn.exe //SILENT //NORESTART
+ ps
+ grep ' /usr/bin/bash$'
+ awk '{print "kill -9 " $1 ";" }'
+ sh
sh: line 1: kill: (532) - No such process
In another run it seems like there was a different issue and maybe that the gfw-install-td22JHkt.exe
failed:
+ start '' /tmp/gfw-install-td22JHkt.exe //SILENT //NORESTART
+ ps
+ grep ' /usr/bin/bash$'
+ awk '{print "kill -9 " $1 ";" }'
+ sh
##[error]Process completed with exit code 1.
Progress! Your screenshot suggestion helped greatly! It seems like the installer is succeeding after some time judging from the screenshots and the output of git --version
.
I still don't understand why git update-git-for-windows -y
is exiting with code 1 nor why that step is ending before the installer has run to completion.
https://github.com/rotu/workflow-scratch/actions/runs/11524785608/job/32085698468
I still don't understand why
git update-git-for-windows -y
is exiting with code 1 nor why that step is ending before the installer has run to completion.
The big problem with git update-git-for-windows
is that it is implemented as a Unix shell script, yet its purpose is to upgrade, potentially including the MSYS2 runtime (that is in use while running the shell script and therefore cannot be overwritten). As a work-around, the script starts the installer and immediately exits the script, potentially forcefully stopping other Bash instances (after asking the user about it).
Obviously, this is not what we would want in a GitHub workflow: We would want the workflow step to run the installer and wait for its exit status, and avoid stopping any Bash instances.
Sadly, I don't think that there is any way to keep Bash running the script while overwriting the MSYS2 runtime and then having a prayer of the script exiting in an orderly fashion. The most common symptom I saw in the past was an infinitely-repeated error message.
After thinking about this a bit more, I think that the only solution to this puzzle would be to rewrite the command in pure C, linking to libcurl to download the installer manually (and using the presence of /etc/install-options.txt
as a tell-tale whether updating via the installer is applicable in the first place, excluding e.g. PortableGit-based installations).
However, this would be a rather big task.
A much more narrowly-scoped alternative (and therefore much more practical) would be to add some code guarded by if test -n "$GITHUB_ACTIONS"
that verifies that it is not run from a Bash step, and then instead of start
ing the installer, exec
s it.
Okay, I kinda understand that. It does look like a shell script, with a #!/bin/sh
shebang. BUT then there's the start
line:
curl -# -L -o $installer $download || return
start "" "$installer" //SILENT //NORESTART
# Kill all Bash processes (which will let MinTTY quit, too)"
#
# `ps` without `-W` will automatically only catch MSYS2 processes
# that link to *this* MSYS2 runtime, i.e. no processes from other Git
# installations (e.g. Git for Windows' SDK) will be killed.
ps | grep ' /usr/bin/bash$' | awk '{print "kill -9 " $1 ";" }' | sh
return 2
That looks like the cmd.exe
builtin START
, and I'm guessing it's made to emulate it.
Anyhow, in my case, running which start
, I see that it's a batch script at /usr/bin/start
which looks like below (so I don't think it will work if no MSYS2 instance isn't in the $PATH
). That said, if the idea is to spin off the executable, why wrap it in start
and not just use nohup
?[^1]
#!/usr/bin/env bash
# Copyright (C) 2014, Alexey Pavlov
# mailto:alexpux@gmail.com
# This file is part of Minimal SYStem version 2.
# https://sourceforge.net/p/msys2/wiki/MSYS2%20installation/
# File: start
"$COMSPEC" //c start "${@//&/^&}"
[^1]: EDIT: it seems like the lifetime of a command started with nohup
in MSYS2 is scoped to the parent window. That is, if I run the below command in a PowerShell instance, the nohup
'd command outlives git's bash.exe
, but trying to exit
PowerShell will hang, and killing PowerShell will kill the inner command.
& 'C:\Program Files\Git\bin\bash.exe' -c 'nohup /c/WINDOWS/system32/bash -c "while true; do echo -n .; sleep 1; done" &'
Obviously, this is not what we would want in a GitHub workflow: We would want the workflow step to run the installer and wait for its exit status, and avoid stopping any Bash instances.
I'm not sure if you realize that bash
in GitHub actions on Windows is the one shipped with Git for Windows so it's likely to have the same problem. When running on my Windows 11 desktop, however, my bash
points to the one in wsl
, which I think is pretty typical. I'm not sure how the shebangs and start
script are expected to resolve if you run the update script from a shell other than the one distributed with Git for Windows.
that verifies that it is not run from a Bash step, and then instead of
start
ing the installer,exec
s it.
What does exec
do in the bash
shipped with Git for Windows? Does it still hold on to the MSYS2 runtime?
It does look like a shell script, with a
#!/bin/sh
shebang.
It is a shell script.
there's the
start
line
Indeed. This starts the installer, but immediately returns to the shell script. The installer continues running, and the design of the script is that it exits before the installer comes even close to (potentially) overwriting the MSYS2 runtime.
That looks like the
cmd.exe
builtinSTART
, and I'm guessing it's made to emulate it.
Right, it's a look-alike, living in /usr/bin/start
.
I'm not sure if you realize that
bash
in GitHub actions on Windows is the one shipped with Git for Windows
Oh, I am well aware of it, and I tried to fight it, and gave up. I did not want the burden of having to maintain Git for Windows and in addition the Bash that runs all of the shell: bash
steps in GitHub Actions jobs on Windows runners.
I'm not sure how the shebangs and
start
script are expected to resolve if you run the update script from a shell other than the one distributed with Git for Windows.
The shell that runs git update-git-for-windows
is the /usr/bin/sh
one because git.exe
prepends /usr/bin
to the PATH
before resolving and running the git-update-git-for-windows
script.
What does
exec
do in thebash
shipped with Git for Windows?
exec
does the same as the exec()
syscall on Linux: It replaces the current process with a new one, in this case the installer, effectively exiting the current shell script but still having callers wait for the exit code, in this instance the exit code of the exec
ed process, though.
Exactly what we need here.
It is a shell script.
I think I'm still confused by why the script kills /usr/bin/bash
but not just all MSYS processes besides the upgrade script and its parents. /usr/bin/sh
is even binary-identical on my machine!
Oh, I am well aware of it, and I tried to fight it, and gave up. I did not want the burden of having to maintain Git for Windows and in addition the Bash that runs all of the
shell: bash
steps in GitHub Actions jobs on Windows runners.
I feel you, man...
It replaces the current process with a new one, in this case the installer.
I don't get is whether any OS resources, file handles, or MSYS2 stuff depend on the process exiting in order to clean up. Or if it's going to be left in a state that conflicts with the installer.
It seems like, if we used exec
in the shell script, the installer would pull the rug out from under the git.exe
executable. If I'm reading the call to waitpid
in try_shell_exec
correctly, the git process would wait for the process to exit (regardless of whether that process is running the shell script or the installer has taken over).
After composing the previous message, I realized that Windows implicitly scopes some things to the calling window and not the process. I don't understand how that interplays with exec
.
Per @vorporeal at https://github.com/actions/runner-images/issues/10843#issuecomment-2460055938
For folks running into this issue, we've found the following step to be an easy and reliable way to get git-for-windows updated to the latest version:
- name: Update to latest git-for-windows run: | Start-Process git -Wait -ArgumentList 'update-git-for-windows -y'
Interesting. @dscho any insights? Does git
needs to be started in a new window for update-git-for-windows
to behave as expected?
Start-Process git -Wait -ArgumentList 'update-git-for-windows -y'
Does
git
needs to be started in a new window forupdate-git-for-windows
to behave as expected?
No, that's not what works around the issue here. What does work around the issue here is that Start-Process
returns a Process
object, even if it was called with -Wait
, and does not propagate the exit code of the started process as return value. Instead, the exit code is made available as a property of the Process
object. If the caller would inspect that property and error out if it is non-zero, you would get the exact same behavior as if you had called git update-git-for-windows -y
directly.
The only downside with this is that you will not catch any problems when the installer could not be downloaded, or when it could not successfully run to completion.
It is a shell script.
I think I'm still confused by why the script kills
/usr/bin/bash
but not just all MSYS processes besides the upgrade script and its parents./usr/bin/sh
is even binary-identical on my machine!
@rotu The script needs to kill its parent process. Otherwise its parent process won't be able to exit correctly if msys-2.0.dll
is overwritten with a new version by the installer. The script tries to ensure that there are no other MSYS processes running, but it cannot require that its parent process isn't running ;-)
It replaces the current process with a new one, in this case the installer.
I don't get is whether any OS resources, file handles, or MSYS2 stuff depend on the process exiting in order to clean up. Or if it's going to be left in a state that conflicts with the installer.
Unlike Linux, on Windows it is not possible to overwrite files that are in use. To emulate that behavior, Cygwin's runtime (and therefore also the MSYS2 runtime, which is a friendly fork of the Cygwin one) plays nasty tricks. This is the reason why the installer can overwrite msys-2.0.dll
even though a Bash is running the update-git-for-windows
script.
However, this emulation is far from perfect. For example, Cygwin cannot prevent Windows' now-broken assumption of an unchanged msys-2.0.dll
(which is memory-mapped), which is the reason why the parent process would crash in nasty ways when trying to exit cleanly and tear down its resources if the msys-2.0.dll
file was switched out under its feet. The only way I found to make that work (kind of, at least) was to use the kill
command to terminate the parent process. Which will result in a non-zero exit code, but in interactive use this does not matter because the Git Bash window (or "Run command" window) is then gone anyway, so the user won't see it.
It was reported by @rotu in https://github.com/git-for-windows/git/issues/5199#issuecomment-2435933384 that it does not work:
It could be a modal dialog? If so, it should be possible to capture it with something like this PowerScript:
Another idea would be to run the
git-update-git-for-windows
script with tracing turned on:or to dig into it by debugging with tmate:
However, my money is on this part of the script which tries to ensure that the (interactive) Bash session is quit forcefully lest an updated MSYS2 runtime prevent an orderly exit.
Now, if that last hunch is correct, I could imagine that we need to add a special case for when
$GITHUB_ACTIONS
is non-empty and simply skip that code block in that scenario.@rotu, would you like to give this a try?