Trouble emulating Ctrl-D as EOF

bharatvaj commented 4 weeks ago

I've been trying to replicate some of the UNIX behaviors on Windows. clink works great for this out of the box, except that I have a problem mapping Ctrl-D as EOF. When I map it to 'win-insert-eof', such as,

$if clink
"\C-d": win-insert-eof
$endif

it has the following side effects,

1. Ctrl-D exits console ✅
2. During line editing, Ctrl-D inserts ^Z instead of deleting the character that's in front of it ❌
3. Ctrl-D doesn't end an input ❌

$ awk "{ print $1 } END { print \"end\" }"
hey
hey
^D # I pressed ctrl-d, and expected a ^Z since I mapped it, but ^D shows up anyway

Is it possible to achieve 2 and 3?

Relevant discussion: https://github.com/rmyorston/busybox-w32/issues/444

chrisant996 commented 4 weeks ago

I think you are expecting to be able to redefine how consoles input works in Windows in general. Clink cannot do that.

Ctrl-D exits console

Refer to Enhanced default settings in the Getting Started section of the documentation, and the clink.ctrld_exits setting.

That's how to control whether Ctrl-D exits from cmd.exe.

During line editing, Ctrl-D inserts ^Z instead of deleting the character that's in front of it

Yes, because you made a key binding that told it to do that. Remove the key binding you added, and instead use the clink.ctrld_exits setting.

Ctrl-D doesn't end an input

It looks like you're entering input into awk. Clink is not involved there, and cannot change how Windows console mode input works in general. Clink can only affect how cmd.exe's own command line input prompt behaves.

chrisant996 commented 4 weeks ago

Also:

On both Linux and Windows, the EOF character is 0x1A, which is ^Z which is Ctrl-Z. This predates the existence of Linux or Windows.
Anywhere that you're able to press Ctrl-D in Linux and have it get interpreted as some kind of "end", such as end shell session or end input or whatever, then there is some code running that actively interprets the Ctrl-D and converts it into something else. That's happening in application code, not in the input subsystem of the OS. ...Well, I'm not positive about it happening in application code on Linux, but certainly on Windows the only place it could be done is in application code, because on Windows ^D means ^D not ^Z.
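
As a rough illustration of what "code running that actively interprets the Ctrl-D" could look like at the application level, here is a minimal Python sketch. The helper name `read_lines_until_ctrl_d` and the sample input are hypothetical, made up for this example. It is not how Clink, awk, or busybox-w32 actually handle input.

```python
import io

CTRL_D = b"\x04"  # on its own, just an ordinary control byte (^D)


def read_lines_until_ctrl_d(stream):
    """Hypothetical helper: collect lines from a binary stream and treat a
    line that starts with ^D as "end of input". The meaning of ^D here comes
    entirely from this function, not from the stream itself."""
    lines = []
    for raw in stream:
        if raw.startswith(CTRL_D):
            break  # the application decides that ^D means "stop"
        lines.append(raw.rstrip(b"\r\n"))
    return lines


# Simulated input: two lines, then a line containing only ^D, then more data.
fake_input = io.BytesIO(b"hey\r\nthere\r\n\x04\r\nignored\r\n")
collected = read_lines_until_ctrl_d(fake_input)
```

Anything after the ^D line is never looked at, which is the "end" behavior, but only because this particular piece of code chose to treat the byte that way.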

chrisant996 commented 4 weeks ago

The question made me search for more info on how/why Ctrl-D has the effects it does on Linux, and what Ctrl-D really means on Linux.

This answer explains it very well -- but note that here "the OS" refers to Linux, not Windows, since that was the context of the question that this answered:

In Unix, most objects you can read and write - ordinary files, pipes, terminals, raw disk drives - are all made to resemble files.

A program like cat reads from its standard input like this:
n = read(0, buffer, 512);
which asks for 512 bytes. n is the number of bytes actually read, or -1 if there's an error.

If you did this repeatedly with an ordinary file, you'd get a bunch of 512-byte reads, then a somewhat shorter read at the tail end of the file, then 0 if you tried to read past the end of the file. So, cat will run until n is <= 0.

Reading from a terminal is slightly different. After you type in a line, terminated by the Enter key, read returns just that line.

There are a few special characters you can type. One is Ctrl-D. When you type this, the operating system sends all of the current line that you've typed (but not the Ctrl-D itself) to the program doing the read. And here's the serendipitous thing: if Ctrl-D is the first character on the line, the program is sent a line of length 0 - just like the program would see if it just got to the end of an ordinary file. cat doesn't need to do anything differently, whether it's reading from an ordinary file or a terminal.

Another special character is Ctrl-Z. When you type it, anywhere in a line, the operating system discards whatever you've typed up until that point and sends a SIGTSTP signal to the program, which normally stops (pauses) it and returns control to the shell.

So in your example
$ cat > file.txt
pa bam pshhh<Ctrl+Z>
[2]+  Stopped         cat > file.txt
you typed some characters that were discarded, then cat was stopped without having written anything to its output file.
$ cat > file.txt
pa bam pshhh
<Ctrl+Z>
[2]+  Stopped         cat > file.txt
you typed in one line, which cat read and wrote to its output file, and then the Ctrl-Z stopped cat.

Credit: Mark Plotnick, Jan 6, 2015.
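
The read loop from the quoted answer can be sketched in Python, using `os.read` as a stand-in for the C `read` call. This is only a sketch: the function name `cat`, the use of pipes instead of a terminal, and the sample text are choices made for this example. Note that `os.read` raises `OSError` on failure rather than returning -1, so only the zero-length case needs checking.

```python
import os


def cat(in_fd, out_fd, chunk=512):
    """Copy in_fd to out_fd until a read returns zero bytes.
    Returns the number of bytes copied."""
    total = 0
    while True:
        data = os.read(in_fd, chunk)  # like: n = read(0, buffer, 512)
        if not data:                  # zero-length read: treat as end of input
            return total
        view = memoryview(data)
        while view:                   # os.write may write fewer bytes than asked
            written = os.write(out_fd, view)
            view = view[written:]
        total += len(data)


# Demo with pipes: closing the write end makes the next read return b"",
# which is the same zero-length result the answer describes for Ctrl-D
# typed at the start of a line.
src_r, src_w = os.pipe()
dst_r, dst_w = os.pipe()
os.write(src_w, b"pa bam pshhh\n")
os.close(src_w)
copied = cat(src_r, dst_w)
os.close(dst_w)
result = os.read(dst_r, 512)
os.close(src_r)
os.close(dst_r)
```

The loop has no special case for terminals or for Ctrl-D. It only reacts to a read that returns nothing.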

Comments after the answer call out that the answer is accurate for "canonical mode terminals", but that any terminal, even canonical mode terminals, can typically be configured to behave differently, or might even have built-in features that behave differently.
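
The canonical mode behavior, and the fact that it is configurable, can be observed with a pseudo-terminal. The following is a sketch that assumes a Linux system where Python's `pty` and `termios` modules are available (it will not run in a native Windows Python). Writing to the master side stands in for typing, and the variable names are made up for this example.

```python
import os
import pty
import termios

# A fresh pseudo-terminal; on Linux it starts out in canonical mode.
master, slave = pty.openpty()

attrs = termios.tcgetattr(slave)
canonical = bool(attrs[3] & termios.ICANON)  # attrs[3] holds the local flags
eof_char = attrs[6][termios.VEOF]            # attrs[6] holds the special chars

os.write(master, b"hey\n")      # "type" a complete line
line = os.read(slave, 512)      # the reader gets just that line

os.write(master, b"abc\x04")    # Ctrl-D in the middle of a line
partial = os.read(slave, 512)   # what was typed so far, without the Ctrl-D

os.write(master, b"\x04")       # Ctrl-D as the first character on a line
empty = os.read(slave, 512)     # a zero-length read

os.close(master)
os.close(slave)
```

The EOF character is just an entry in the terminal settings (`VEOF`), so a program can change it with `termios.tcsetattr`, or turn off canonical mode entirely. That is the sense in which the answer only describes the default configuration.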

Which fills in the gap that I hadn't been sure about:

I had said:

"Anywhere that you're able to press Ctrl-D in Linux and have it get interpreted as some kind of "end" ... then there is some code running that actively interprets the Ctrl-D and converts it into something else. That's happening in application code, not in the input subsystem of the OS."

Which is mostly true, but partly overly simplified. The Ctrl-D handling is happening in the terminal program. Which is not the input subsystem of the OS, and is an application. But terminal mode programs (aka console mode or text mode programs) receive their input preprocessed by a terminal application. So it's not being handled in the application that you run, e.g. awk, but it is being handled in the terminal application which got automatically implicitly launched when you launched a shell program (or whatever).

In Windows, the legacy terminal program was conhost, which doesn't do anything special with Ctrl-D. But other terminal programs can be used, such as Windows Terminal, ConEmu, Hyper Terminal, ConsoleZ, mintty, etc. Some of them have special behaviors for various key bindings.

On Windows, intercepting Ctrl-D isn't common because Windows console mode programs can read directly from the console input, to handle key combinations that aren't normally represented in preprocessed input funneled from a terminal application. Terminating a program on an empty input isn't a typical operation on Windows, either. Input can be async, and receiving 0 bytes does not always imply "end of file - no more input will come". There is a separate way to accurately check whether end of file on a console input handle has been reached, and typically programs recognize when 0 bytes doesn't actually mean end of input. On Windows, ^Z is typically only treated as the EOF pseudo-character when input is read through the C runtime. cmd.exe (and also Clink) uses direct native Windows I/O APIs for reading input and writing output, so ^Z is interpreted as a literal 0x1A character instead of an EOF pseudocharacter, and they recognize that input has not actually ended, so they don't exit.
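
To make the ^Z difference concrete, here is a deliberately simplified Python model. It is not the real C runtime or the real Windows API: `crt_text_mode_view` and `native_view` are hypothetical names, and the model ignores everything else that text mode does (such as CR-LF translation). It only shows the two interpretations of the same bytes.

```python
CTRL_Z = b"\x1a"


def crt_text_mode_view(raw):
    """Simplified model of reading through the C runtime in text mode:
    the first ^Z is treated as an EOF pseudo-character, so nothing at or
    after it is delivered to the program."""
    end = raw.find(CTRL_Z)
    return raw if end == -1 else raw[:end]


def native_view(raw):
    """Simplified model of reading with native I/O APIs: ^Z is an ordinary
    0x1A byte and is delivered like any other character."""
    return raw


typed = b"echo hi\x1a there\r\n"
seen_by_crt = crt_text_mode_view(typed)
seen_natively = native_view(typed)
```

In this model, a program reading through the runtime sees input end at the ^Z, while a program reading natively sees a line that happens to contain a 0x1A character and keeps going.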

bharatvaj commented 4 weeks ago

Thanks for the detailed explanation.

If I understand your post correctly, when clink.ctrld_exits is set to true, ^D is parsed and cmd exits only because clink can read the I/O of cmd.exe, and when a program like awk launches, it has its own separate I/O that clink can't interfere with? I really hope some registry hack exists out there for this.

chrisant996 commented 4 weeks ago

I don't know the context about why you want Windows to behave like Linux.

But generally speaking, Windows is going to behave like Windows, not like Linux.

Have you considered using cygwin and bash on Windows? Using Linux tools with something like cygwin might be able to emulate enough Linux behaviors to feel like Linux even while using Windows.

Or have you considered using WSL to actually use Linux on Windows?

I don't know if those are feasible in your case, but I don't think you're going to be able to get Windows to behave like Linux unless you run either an emulation layer or actual Linux on Windows.

bharatvaj commented 4 weeks ago

I switch between platforms a lot, and my muscle memory keeps me sane.

While I do use cygwin for compiling programs sometimes, and have used WSL, there is a certain snappiness when using cmd.exe.

clink + busybox-w32 + cmd.exe and a setup script is such a sweet spot that it handles 90% of what I do on UNIX without costing much in size and speed. Even though I have been hoping to push that 90%, I have learned to choose my battles.

chrisant996 / clink

Trouble emulating Ctrl-D as EOF #660