Open macdjord opened 2 years ago
I suspect that you'll most likely get the same behavior in Cygwin when using a build right off of the cygwin-3_3-branch. It's too sad that https://github.com/cygwin/cygwin/actions/runs/2293880675/workflow does not contain a step to upload the artifacts, otherwise you could test much more conveniently.
FWIW it should be as easy as adding something like this at the end of the workflow:
- uses: actions/upload-artifact@v2
  with:
    name: install
    path: install
@dscho I have nothing to do with Cygwin and have no control over their workflow.
Also, I have discovered something that's probably related: if I use CTRL-C inside vagrant ssh, it immediately exits the session.
Since Git for Windows relies on the MSYS2 Runtime for its Bash, and the MSYS2 Runtime is a close derivative of the Cygwin runtime, you have quite a bit to do with Cygwin, @macdjord. In fact, I strongly suspect the root issue to lie within the Cygwin runtime's source code. And if that is true, the bug fix will need to be implemented in Cygwin, too, and it is quite probable that some Cygwin folks would be helpful in this regard if we could demonstrate that Cygwin itself (without the MSYS2 patches on top) already displays the bug. Hence my suggestion to build something based on the cygwin-3_3-branch.
I'm fairly certain the point @macdjord was making is that they're not a Cygwin maintainer.
@dscho: @Pyker has it. I'm not a Cygwin maintainer or contributor, and have no idea how to build anything based on a specific Cygwin branch.
Issue persists in git version 2.37.1.windows.1. Guess it's back to 2.34.1 for me again.
I suspect that you'll most likely get the same behavior in Cygwin
On Cygwin:
[local] $ ssh server
...
[server] $ ^C
[server] $ ^D
[local] $ echo $?
130
[local] $ ssh server
...
[server] $ ^D
[local] $ echo $?
0
so this seems to reproduce in Cygwin (using the Cygwin ssh updated today, nothing built from source).
It's too sad that [Cygwin's CI] workflow does not contain a step to upload the artifacts, otherwise you could test much more conveniently
You can usually ask on the mailing list for a new snapshot; you may need to change the directions for MSYS2/Git for Windows Bash.
FWIW it should be as easy as adding something like this at the end of the workflow
dscho I have nothing to do with Cygwin and have no control over their workflow.
The workflow change suggestion seems to be directed more at some hypothetical enterprising person who wants to make downstream maintainers' lives easier than at any particular person on this thread, but that seems to have been unclear enough to confuse people.
I strongly suspect the root issue to lie within the Cygwin runtime's source code. And if that is true, the bug fix will need to be implemented in Cygwin, too, and it is quite probable that some Cygwin folks would be helpful in this regard if we could demonstrate that Cygwin itself (without the MSYS2 patches on top) already displays the bug.
not sure what you want to do next
It's too sad that [Cygwin's CI] workflow does not contain a step to upload the artifacts, otherwise you could test much more conveniently
You can usually ask on the mailing list for a new snapshot; you may need to change the directions for MSYS2/Git for Windows Bash.
Right, or you simply fork https://github.com/cygwin/cygwin/, create a branch, edit .github/workflows/cygwin.yml to append a step at the end to upload the build output, i.e. something like this:
- uses: actions/upload-artifact@v3
  with:
    name: build
    path: build
then push, wait for the workflow to run and finish, then download the build.zip artifact from that run, then test.
FWIW it should be as easy as adding something like this at the end of the workflow
dscho I have nothing to do with Cygwin and have no control over their workflow.
The workflow change suggestion seems to be directed more at some hypothetical enterprising person who wants to make downstream maintainers' lives easier than at any particular person on this thread, but that seems to have been unclear enough to confuse people.
I am sorry that I wasn't clearer. Maybe you can help with communicating in the Git for Windows project? I would really, honestly love that.
I strongly suspect the root issue to lie within the Cygwin runtime's source code. And if that is true, the bug fix will need to be implemented in Cygwin, too, and it is quite probable that some Cygwin folks would be helpful in this regard if we could demonstrate that Cygwin itself (without the MSYS2 patches on top) already displays the bug.
not sure what you want to do next
The best way to go forward, now that it is confirmed to be a bug in Cygwin that reproduces there, would probably be to send that bug report to the Cygwin project. They seem to suggest to send a mail with a detailed bug report to the Cygwin mailing list.
Following suggestions in msys2/msys2-runtime#83, I tried with CYGWIN=disable_pcon and CYGWIN=enable_pcon; both had the same problem. I tried rolling back Cygwin to 3.3.4-2, then 3.3.3-1 (from 3.3.5-1), and openssh to 8.9p1 from 9.0p1. All had the same problem. I then tried
[local] $ ssh server
...
[server] $ ^C
[server] $ true
[server] $ ^D
[local] $ echo $?
0
so I suspect the Cygwin behavior, at least, is intentional on the part of Cygwin and/or SSH. Does the same thing happen in Git for Windows bash if a command completes successfully after the keyboard interrupt (both with and without the wrapper script)?
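For readers unfamiliar with the number 130 in the transcripts above: shells commonly report 128 plus the signal number for an interrupted command, and SIGINT is signal 2. That would be consistent with the status dropping back to 0 once `true` has run, since the remote shell exits with the status of its last command and ssh passes that status along. A minimal sketch of the decoding; the helper name `describe_exit_status` is made up for illustration.

```python
import signal


def describe_exit_status(status: int) -> str:
    """Interpret an exit status using the shell convention 128 + signal number."""
    if status > 128:
        try:
            # signal.Signals raises ValueError if the number is not a known signal
            return f"terminated by {signal.Signals(status - 128).name}"
        except ValueError:
            return f"exit status {status}"
    return "success" if status == 0 else f"exit status {status}"


print(describe_exit_status(130))
print(describe_exit_status(0))
```

This only decodes the convention; it does not tell which process (the remote shell, ssh, or a local wrapper) produced the status.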
I'm kind of curious why the Python wrapper terminates with a KeyboardInterrupt, since ssh already handled that signal; perhaps it forgot to clear it after processing?
FWIW I just tried to ssh into one of my machines (not through any Python script), and Ctrl+C'ed a long-running git log command, then pressed Ctrl+C again, quit the pager, and pressed Ctrl+C again, and none of those key presses terminated the SSH session.
And then I had this idea that the Python script in question probably runs through a non-MSYS Python (because Git for Windows does not ship with a Python interpreter), therefore the MSYS2 runtime handles the Ctrl+C press by sending a ConsoleCtrlEvent to Python. If that somehow fails, the MSYS2 runtime will terminate the process in more forceful ways.
To investigate this deeply, a lot of time and patience will be required, as well as the willingness to learn about the Cygwin/MSYS2 runtime's internals and how to build and modify it. I am willing to guide any volunteer through this, but I will not distract myself away from more impactful projects to perform that investigation myself.
FWIW I just tried to ssh into one of my machines (not through any Python script), and Ctrl+C'ed a long-running git log command, then pressed Ctrl+C again, quit the pager, and pressed Ctrl+C again, and none of those key presses terminated the SSH session.
Makes sense, and about what I (and the original reporter, it seems) would expect.
And then I had this idea that the Python script in question probably runs through a non-MSYS Python (because Git for Windows does not ship with a Python interpreter), therefore the MSYS2 runtime handles the Ctrl+C press by sending a ConsoleCtrlEvent to Python. If that somehow fails, the MSYS2 runtime will terminate the process in more forceful ways.
The Ctrl-C event seems to get to the ssh executable, since it's showing up as ^C in the original transcript, but the MSYS2 runtime doesn't seem to have used the forceful methods to terminate the process, since the example session in the original report logs out normally (the KeyboardInterrupt pops up after the logout).
To investigate this deeply, a lot of time and patience will be required, as well as the willingness to learn about the Cygwin/MSYS2 runtime's internals and how to build and modify it. I am willing to guide any volunteer through this, but I will not distract myself away from more impactful projects to perform that investigation myself.
I'd like to narrow it down a bit before saying it's definitely in MSYS internals; if the original reporter could report whether ssh sessions with and without the wrapper terminate abnormally if they press Ctrl+C in the ssh session to generate an interrupt, then execute a successful command (true is simple) before logging out of the ssh session, that would narrow down whether the problem is in ssh or python plus the wrapper script (and how either of those interact with MSYS2).
I just checked with Anaconda python and Cygwin ssh and sleep; Ctrl-C behaved as expected for the subprocess and also generated a KeyboardInterrupt exception in python (even if I execute true between pressing Ctrl-C and ending the session). I then tried this from python on one linux machine sshing into another, which did not produce KeyboardInterrupt exceptions; Cygwin python produced a similar result, as did Anaconda python with Windows's built-in ssh.
@macdjord, would it be possible to wrap the ret = subprocess.call(cmd) line in something like this:
try:
    ret = subprocess.call(cmd)
except KeyboardInterrupt:
    if sys.platform != "win32":
        raise
That should at least avoid the tracebacks until someone chases down what happens with signals/keyboard events in non-MSYS/non-Cygwin programs that call MSYS/Cygwin programs. Depending on how general the script wants to be, you may want to make the condition specific to Windows python calling MSYS/Cygwin (including git-for-windows) executables.
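One detail of the snippet above: if the interrupt is swallowed on Windows, `ret` is never assigned. A slightly fuller sketch of the same idea follows, assuming the wrapper just wants an exit status back. The function name `run_wrapped` and the fallback status 130 are illustrative choices, not part of the original script.

```python
import subprocess
import sys


def run_wrapped(cmd: list[str]) -> int:
    """Run cmd and return its exit status, tolerating a stray KeyboardInterrupt on Windows."""
    try:
        return subprocess.call(cmd)
    except KeyboardInterrupt:
        if sys.platform != "win32":
            raise
        # Assumption: on Windows the interrupt is a leftover from a Ctrl+C
        # pressed earlier in the session, so report the conventional SIGINT status.
        return 130


# Harmless demonstration: a child process that exits with status 3.
print(run_wrapped([sys.executable, "-c", "raise SystemExit(3)"]))
```

In the real wrapper, `cmd` would be the ssh command line the script already builds; whether 130 or some other value is the right fallback depends on what callers of the wrapper expect.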
I'd like to narrow it down a bit before saying it's definitely in MSYS internals; if the original reporter could report whether ssh sessions with and without the wrapper terminate abnormally if they press Ctrl+C in the ssh session to generate an interrupt, then execute a successful command (true is simple) before logging out of the ssh session, that would narrow down whether the problem is in ssh or python plus the wrapper script (and how either of those interact with MSYS2).
I tested it both with and without the wrapper in 2.34.1. If I execute a successful command after the Ctrl-C, then the exit code is 0. I can't test it with later versions right now, since I don't have any installed.
I figure I might as well leave my experience here as it may be helpful in diagnosing this issue.
I use vagrant in cygwin to bring up and manage several vms.
For those unfamiliar you can run vagrant up to provision a vm from a vagrant config file, then run vagrant ssh to ssh into that system. The vagrant ssh command does some magic behind the scenes to properly log you into the vm.
After installing the latest git for windows (2.34.1) I ran into the problem described here (Unfortunately I can't confirm if it worked properly before installing git for windows).
However, I found a workaround. Vagrant has a way to output the magic it uses to ssh into the vm into a file.
If, from cygwin, I run ssh -F ssh-config-file I do not run into this issue.
I believe vagrant runs in ruby. So I figure by running ssh on the command line I am avoiding running it from a ruby interpreter and possibly avoiding this issue.
On a fresh msys2 install (openssh 8.9p1-3, msys2-runtime 3.3.5-3) this problem shows up when using a ProxyCommand and the actual proxy command is a plain Windows (non-msys) program. With a direct ssh into the same box CTRL-C works as expected.
this problem shows up when using a ProxyCommand and the actual proxy command is a plain Windows (non-msys) program.
Out of curiosity: do you know whether this Windows program has set a Console Ctrl handler (via SetConsoleCtrlHandler())?
I don't know. But it is a node.js script (wscat, installed via npm install --global wscat2). A quick search shows that libuv is used. And that seems to set a handler and maps CTRL-C to SIGHUP internally.
I don't know. But it is a node.js script (wscat, installed via npm install --global wscat2). A quick search shows that libuv is used. And that seems to set a handler and maps CTRL-C to SIGHUP internally.
So... logical next question: does that script register a SIGHUP handler?
The version of OpenSSH included with git 2.41 reports as: OpenSSH_9.3p1, OpenSSL 1.1.1u 30 May 2023
The version of OpenSSH available from https://github.com/PowerShell/Win32-OpenSSH/releases reports as: OpenSSH_for_Windows_9.2p1, LibreSSL 3.7.2
If I connect to server X with the first client, run top, and issue CTRL-C - I get client_loop: send disconnect: Broken pipe
If I connect to server X with the second client, run top, and issue CTRL-C... it ends top and the connection remains
I am still convinced that the problem is a missing signal handler, but nobody answers my questions here, so...
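To make the "missing signal handler" hypothesis concrete: a helper process that does not want to die on Ctrl+C has to install its own handling for the interrupt. The sketch below shows what that looks like in a Python-based helper. It is an illustration only; the thread has not confirmed that this is the cause, and it says nothing about how wscat or session-manager-plugin actually behave. The name `ignore_sigint` is made up.

```python
import contextlib
import signal


@contextlib.contextmanager
def ignore_sigint():
    """Temporarily ignore SIGINT (Ctrl+C), restoring the previous handler on exit."""
    previous = signal.signal(signal.SIGINT, signal.SIG_IGN)
    try:
        yield
    finally:
        # previous is None if the old handler was not installed from Python
        if previous is not None:
            signal.signal(signal.SIGINT, previous)


with ignore_sigint():
    # A Ctrl+C delivered here would not raise KeyboardInterrupt in this process.
    ignored = signal.getsignal(signal.SIGINT) == signal.SIG_IGN
print(ignored)
```

Note that `signal.signal` may only be called from the main thread, and that how an ignored SIGINT interacts with the MSYS2 runtime's Ctrl+C delivery is exactly the open question in this thread.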
I would like to add a side note about using ssh ProxyCommand.
The scenario: I'm using git-for-windows as my ssh client to some AWS vms. We are using ssm and my ssh_config is configured to use aws ssm start-session as described here: https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-getting-started-enable-ssh-connections.html#ssh-connections-enable
Since git-for-windows version 2.35.0 precisely I'm hitting this issue. I tested all (portable) versions until v2.42.0.2 and, for all those versions, trying to interrupt top with ctrl-c kills the aws ssm start-session.
This is working fine until v2.34.1 ...
example with latest v2.42.0.2:
[ec2-user@ec2vm ~]$ top
top - 07:30:32 up 46 min, 3 users, load average: 0.02, 0.01, 0.00
Tasks: 104 total, 1 running, 103 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 960.7 total, 554.2 free, 371.1 used, 179.1 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 589.6 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1117 root 20 0 1023420 27200 12172 S 6.2 2.8 0:05.64 ssm-session-wor
1 root 20 0 171876 16020 10668 S 0.0 1.6 0:03.33 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
5 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 slub_flushwq
6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 netns
8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0H-events_highpri
10 root 0 -20 0 0 0 I 0.0 0.0 0:00.02 kworker/0:1H-events_highpri
11 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
13 root 20 0 0 0 0 I 0.0 0.0 0:00.00 rcu_tasks_kthre
14 root 20 0 0 0 0 I 0.0 0.0 0:00.00 rcu_tasks_rude_
15 root 20 0 0 0 0 I 0.0 0.0 0:00.00 rcu_tasks_trace
16 root 20 0 0 0 0 S 0.0 0.0 0:00.04 ksoftirqd/0
17 root 20 0 0 0 0 I 0.0 0.0 0:00.08 rcu_preempt
18 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
20 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
22 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs
23 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 inet_frag_wq
24 root 20 0 0 0 0 S 0.0 0.0 0:00.03 kauditd
25 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khungtaskd
27 root 20 0 0 0 0 S 0.0 0.0 0:00.00 oom_reaper
28 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 writeback
<hitting CTRL+C here>
Command '['session-manager-plugin', '{"SessionId": "censored", "TokenValue": "censored", "StreamUrl": "censored", "ResponseMetadata": {"RequestId": "censored", "HTTPStatusCode": 200, "HTTPHeaders": {"server": "Server", "date": "Wed, 25 Oct 2023 07:30:24 GMT", "content-type": "application/x-amz-json-1.1", "content-length": "1045", "connection": "keep-alive", "x-amzn-requestid": "censored"}, "RetryAttempts": 0}}', 'eu-west-1', 'StartSession', 'default', '{"Target": "censored", "DocumentName": "AWS-StartSSHSession", "Parameters": {"portNumber": ["22"]}}', 'https://ssm.eu-west-1.amazonaws.com']' returned non-zero exit status 3221225786.
client_loop: send disconnect: Broken pipe
censored@WIN10-LAPTOP MINGW64 /a/path
$ <----- back to my git-bash prompt on my laptop !
example with v2.34.1:
[ec2-user@ec2vm ~]$ top
top - 07:41:53 up 57 min, 4 users, load average: 0.05, 0.04, 0.01
Tasks: 105 total, 1 running, 104 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.0 sy, 0.0 ni,100.0 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
MiB Mem : 960.7 total, 520.2 free, 405.0 used, 179.3 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 555.8 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
1117 root 20 0 1023420 27200 12172 S 6.7 2.8 0:12.44 ssm-session-wor
1 root 20 0 171876 16020 10668 S 0.0 1.6 0:03.36 systemd
2 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kthreadd
3 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_gp
4 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 rcu_par_gp
5 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 slub_flushwq
6 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 netns
8 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kworker/0:0H-events_highpri
10 root 0 -20 0 0 0 I 0.0 0.0 0:00.03 kworker/0:1H-events_highpri
11 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 mm_percpu_wq
13 root 20 0 0 0 0 I 0.0 0.0 0:00.00 rcu_tasks_kthre
14 root 20 0 0 0 0 I 0.0 0.0 0:00.00 rcu_tasks_rude_
15 root 20 0 0 0 0 I 0.0 0.0 0:00.00 rcu_tasks_trace
16 root 20 0 0 0 0 S 0.0 0.0 0:00.05 ksoftirqd/0
17 root 20 0 0 0 0 I 0.0 0.0 0:00.08 rcu_preempt
18 root rt 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
20 root 20 0 0 0 0 S 0.0 0.0 0:00.00 cpuhp/0
22 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kdevtmpfs
23 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 inet_frag_wq
24 root 20 0 0 0 0 S 0.0 0.0 0:00.03 kauditd
25 root 20 0 0 0 0 S 0.0 0.0 0:00.00 khungtaskd
27 root 20 0 0 0 0 S 0.0 0.0 0:00.00 oom_reaper
28 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 writeback
29 root 20 0 0 0 0 S 0.0 0.0 0:00.02 kcompactd0
30 root 25 5 0 0 0 S 0.0 0.0 0:00.00 ksmd
31 root 39 19 0 0 0 S 0.0 0.0 0:00.02 khugepaged
32 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 cryptd
33 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kintegrityd
34 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kblockd
35 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 blkcg_punt_bio
36 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 tpm_dev_wq
37 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 md
38 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 edac-poller
39 root -51 0 0 0 0 S 0.0 0.0 0:00.00 watchdogd
42 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kswapd0
47 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 kthrotld
51 root 0 -20 0 0 0 I 0.0 0.0 0:00.00 acpi_thermal_pm
52 root 20 0 0 0 0 S 0.0 0.0 0:00.00 xenbus
53 root 20 0 0 0 0 S 0.0 0.0 0:00.00 xenwatch
<hitting CTRL+C here>
[ec2-user@ec2vm ~]$ <----- remain on my sshed VM prompt there !
Not sure if it helps, or how I can help to tackle this, do not hesitate to reach out.
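A note on the "non-zero exit status 3221225786" in the v2.42.0.2 transcript above: in hexadecimal that is 0xC000013A, which is the NTSTATUS code STATUS_CONTROL_C_EXIT. In other words, the local session-manager-plugin process was itself terminated by the Ctrl+C, which matches the symptom that the key press reaches the proxy command rather than only the remote side. A small sketch of the conversion; the helper name `describe_windows_exit` is hypothetical.

```python
STATUS_CONTROL_C_EXIT = 0xC000013A  # NTSTATUS value documented by Microsoft


def describe_windows_exit(status: int) -> str:
    """Render a Windows exit status in hex and flag the Ctrl+C termination code."""
    unsigned = status & 0xFFFFFFFF  # normalize in case the value arrives as a negative int
    text = f"0x{unsigned:08X}"
    if unsigned == STATUS_CONTROL_C_EXIT:
        text += " (STATUS_CONTROL_C_EXIT)"
    return text


print(describe_windows_exit(3221225786))
```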
I would like to add a side note about using ssh ProxyCommand.
The scenario: I'm using git-for-windows as my ssh client to some AWS vms. We are using ssm and my ssh_config is configured to use aws ssm start-session as described here: https://docs.aws.amazon.com/systems-manager/latest/userguide/session-manager-getting-started-enable-ssh-connections.html#ssh-connections-enable
Since git-for-windows version 2.35.0 precisely I'm hitting this issue. I tested all (portable) versions until v2.42.0.2 and, for all those versions, trying to interrupt top with ctrl-c kills the aws ssm start-session.
Same use case here. Tested on 2.43.0.windows.1 and it still happens.
Using aws ssm start-session directly rather than a ProxyCommand via ssh doesn't have this issue.
I stopped using/recommending the version of ssh included with git due to this issue - see my post from June 5
Setup
No
Details
When running an SSH command, hitting CTRL-C sends SIGINT to the remote process instead of terminating the SSH command, as expected. However, in 2.36.1.windows.1, if you sent at least one CTRL-C during an SSH session, then, once the SSH session ends but before the SSH command terminates, the SSH command will receive an interrupt:
Note: ssh-wrappy.git/bin/ssh is a Python wrapper around ssh.exe which we use to automatically translate certain server names into their addresses. dev, however, is not one of those special servers.
If CTRL-C is not used during the SSH session, it terminates cleanly instead:
If I bypass the wrapper script, the same thing happens, just less obviously without Python's exception traceback:
CTRL-C sent to SSH should affect only the remote command.
CTRL-C affected both the remote command & the SSH command.
Notes
2.34.1 behaved correctly - CTRL-C affected only the remote command
2.34.1.windows.1 and tested this. While the Python script doesn't produce a KeyboardInterrupt, if I run SSH directly, I still get exit code 130.
2.36.0.rc0.windows.1 had a similar but worse bug where CTRL-C during an SSH session would immediately kill the SSH command ( https://github.com/microsoft/terminal/issues/12431 )