rmyorston / busybox-w32

WIN32 native port of BusyBox.
https://frippery.org/busybox
Other
697 stars 126 forks source link

third party terminals: "sh" loses window focus #430

Closed avih closed 1 month ago

avih commented 4 months ago

This is a bit weird to explain, and also has some variants, but here's the info I have so far:

On windows 10.0.19045.4529, when launching a 3rd party terminal emulator, like alacritty or wez or rio or few others which are not written in rust (but not with windows terminal), assuming it opens a powershell prompt (because they do, but it's not related to powershell), then running a busybox sh command once is ok, but at the second time the window loses focus, and at the top left of the screen there's a small window, resized to the minimum (iconified?), which has the focus.

E.g. in alacritty, after opening the terminal, and running busybox sh -c echo once: echo1

And then the same command again: echo2

This is the latest busybox.exe release from the homepage from today, but it also happens with earlier builds.

I wasn't able to determine which process it is, but I think it's conhost, though it's hard for me to determine 100% (when manually resizing that window - it's black background with no content, and I don't recognize the icon as any specific application).

If returning the focus manually to the terminal, the terminal doesn't lose focus again in repeated invocations of the same command (and the small window remains).

When exiting the terminal - that extra window closes too.

This also happens if the terminal is configured to launch busybox sh instead of the default powershell. In this case, the first time a new shell is invoked from the busybox sh prompt (e.g. running sh -c echo at the prompt), the terminal loses focus and the small window appears, and similarly repeated further sh commands don't lose focus again, and the small window closes when closing the terminal.

Some of the 3rd party terminals (alacritty, wez) support using a newer conhost from the windows-terminal project (OpenConsole.exe and conpty.dll at the same dir as the 3rd party terminal),

When I do that, then the symptom after the 2nd command is still to lose focus of the main terminal window, but I can't see the small window which presumably has focus.

The issue doesn't happen with other busybox applets, for instance opening the terminal and running busybox echo or busybox echo | busybox grep x multiple times, and it also doesn't happen with non-busybox console apps that I tried (e.g. the "upstream" less.exe).

The issue also doesn't happen when running in the normal conhost window (e.g. win+R and then "cmd.exe"), and it also doesn't happen when running in the new conhost (double-click "OpenConsole.exe" at the windows-terminal dir), and, as mentioned, it doesn't happen in the windows terminal.

I also had cases where I closed the small window manually, and then after closing the terminal window there remained some zombie process which prevents deleting the terminal exe file on disk., and iirc i had to kill the correct conhost process to allow deleting ther terminal binary (after finding which process has a handle to the terminal executable file)..

However, I can't reproduce this issue right now.

That's it for now.

So, first thing first, can you reproduce the issue?

rmyorston commented 4 months ago

I can't even get Alacritty to run. I've tried both the installer and the portable executable on Windows 10 and Windows 11 but it does absolutely nothing.

Perhaps it doesn't like the graphics in my virtual machines.

avih commented 4 months ago

I'm guessing most of them claim to have super fast gpu accelrated rendering (which really doesn't matter typically) so they can't find a good enough gpu in your VM. Modern apps are like that... (wayland in a linux vm is usually similarly bad. "must have gpu!!!1!").

I'm just looking for something to use instead of windows-terminal, and alacritty almost fits the bill (tiny and good, but subpixel antialiasing is broken when the subpixel order is not RGB).

Anyway, in alacritty and all the others I always get the "portable"/zip thing if available.

I just reproduced in "contour". I got the zip release here https://github.com/contour-terminal/contour/releases.

Same with the wez zip (both wezterm.exe and wezterm-gui.exe) here https://github.com/wez/wezterm/releases

Same with rio-portable.exe here https://github.com/raphamorim/rio/releases

There are more windows terminals here https://github.com/cdleon/awesome-terminals (the terminals above use the new conpty thing, which I think is what we need, but some don't e.g. conemu is just a conhost wrapper hack).

avih commented 4 months ago

Even if you can't run it, can you think of anything that sh does with the console - even when non-interactive, but other applets don't?

avih commented 4 months ago

Actually I just reproduced it also in vscodium (liberated build of vscode), got the zip here https://github.com/VSCodium/vscodium/releases

I modified my key bindings, but I think the default to open the internal terminal is ctrl-backtick or ctrl-shift-backtick - https://code.visualstudio.com/docs/terminal/basics

This uses electron which I think uses the chrome rendering engine, which I hope can work without a GPU...

rmyorston commented 4 months ago

I eventually managed to get rio working, after a fashion.

There's a hidden conhost.exe associated with it. Presumably this is how console applications communicate with it.

The busybox-w32 shell has the noconsole and noiconify options which try to control the visibility of the console window. The console host being hidden confuses the issue because the shell default is to iconify the window when it's to be concealed.

A workaround is to add -o noiconify to the shell command to force the option to match the actual state of the console host.

A more permanent solution may be to make the shell a bit less fussy. If we want the console host to be hidden and it's actually iconified that's probably good enough, and vice versa.

The shell will still respect the noiconify setting when it conceals the console itself.

A patch for this is:

diff --git a/shell/ash.c b/shell/ash.c
index 15c0e56a3..4b063217a 100644
--- a/shell/ash.c
+++ b/shell/ash.c
@@ -2981,17 +2981,10 @@ static int console_state(void)
    DECLARE_PROC_ADDR(BOOL, ShowWindow, HWND, int);

    if (INIT_PROC_ADDR(user32.dll, ShowWindow)) {
-       BOOL state;
+       BOOL visible = IsWindowVisible(GetConsoleWindow());
+       BOOL iconified = IsIconic(GetConsoleWindow());

-       if (noiconify) {
-           state = IsWindowVisible(GetConsoleWindow());
-           if (state >= 0)
-               return state == 0;
-       } else {
-           state = IsIconic(GetConsoleWindow());
-           if (state >= 0)
-               return state != 0;
-       }
+       return !visible || iconified;
    }
    return 2;
 }
avih commented 4 months ago

Thanks.

I assume you were able to reproduce the issue?

Can't say I understand exactly the context of the issue, but am I correct to understand that by default sh inspects/touches the windows state even when none of the visibility/iconify options are specified by the user?

I'd think that by default it should let it be, and if the user specifies yes/no visible/iconic then inspect and maybe touch the state. No?

If we want the console host to be hidden and it's actually iconified that's probably good enough, and vice versa.

Not sure who is "we" and what the context is, but strictly on a technical level, I'd think that hidden and iconified are clearly different, and if someone wants one of those states, the other state might not be considered "good enouh".

Regardless, are the option -o noconsole to iconify the window, and -o noconsole -o noiconify to hide it? if yes, does -o noiconfiy on its own do anything?

Not that I care too much, but I'd think a pair like hidewindow and iconify would be easier to grasp, and clearly(?) map to the two individually control states.

It's indeed a good question what to do on init, but I'd think that do nothing. If the user queries the state, like set +o, then simply inspect each of their current states, and if the user specifies a state, like set +o hidewindow or set -o iconfy then set the relevant state accordingly.

I.e. the option value is never kept in any busybox var. Set calls the API with the new state, and query queries the API, and Windows itself is the "storage" of these options values.

A patch for this is: ...

I can confirm the patch seems to fix the issue, at least with alacritty. Thanks.

(but I'm not yet replacing my main busybox.exe binary because I need to address some rebasing issues with my native unicode support patches...)

Is the patch "the more permanent solution"? (I presume yes, and that the temp solution is to use -o noiconify ?)

ale5000-git commented 4 months ago

@avih See my previous suggestions here: https://github.com/rmyorston/busybox-w32/issues/325#issuecomment-2084124531 and here: https://github.com/rmyorston/busybox-w32/issues/325#issuecomment-2084146528

rmyorston commented 4 months ago

@avih Yes, I was able to reproduce the issue. You are correct regarding the workaround and the more permanent solution.

There's no plan to change the names of the options.

Not sure who is "we"

In this case there are (at least) two actors involved: the user of the shell and the owner of the conhost.exe process (the third-party terminal). They may have different and irreconcilable ideas about the visibility of the console window. And, indeed, the purpose of said window.

The proposed patch is a pragmatic attempt to have something that works in this new situation.

The noconsole option indicates whether or not the console is visible; noiconify indicates the manner in which the console is to be made invisible.

avih commented 4 months ago

The noconsole option indicates whether or not the console is visible; noiconify indicates the manner in which the console is to be made invisible.

So it's well defined, and I assume that noiconify only matters when noconsole is in effect.

If we want the console host to be hidden and it's actually iconified that's probably good enough, and vice versa.

I also agree that if the console is either hidden or iconified (or both?) then noconsole should be considered set.

But I'm still not sure at which context "want hidden but accept iconified, and vice verse" applies.

Does it mean that if noconsole is set and the state is either hidden or iconified, then both -o noiconify and +o noiconify would be ignored?

Or maybe the context is only at startup when we create the initial internal state of these options?

In this case there are (at least) two actors involved: the user of the shell and the owner of the conhost.exe process (the third-party terminal). They may have different and irreconcilable ideas about the visibility of the console window.

Yes. This is an issue, because the console/iconify options apply to conhost but not to the 3rd party terminal, but it's only an issue if you want to maintain an internal state (and expect it to match the actual state at all times).

It would not be an issue, I think, if these options are completely virtual, i.e. their states are not stored in busybox, but rather at the window manager. So querying the options queries the API, and changing them uses the API to set the new state.

This would also mean that there's no need to inspect or change them on startup. And if the user sets one of them (as startup options or at any later time) then still there's no need to inspect - just set them via the windows API.

So that the only need to inspect arrise when the user queries these options.

The main thing to avoid, IMO, is trying to set the window state on init when console/iconify startup options were not specified.

Do you agree that would be better, with less issues, and more consistent, than maintaining an internal state?

If yes, then the choice to not make it behave like so is implementation details, because it's simpler and hopefully good enough to maintain an internal state and compare it to the actual state as needed?

Another approach might be to try and detect whether we're in "conpty" mode where conhost is not the actual window which the user interacts with (the window could be 3td party terminal, or no window in an ssh session, or some app which opens conpty but doesn't display a window, etc)

If we can detect that, then it might be great to behave as if these two options are readonly, and during queries return whatever (querying conhost would be fine I think).

Maybe the fact that the initial state is hidden (regardless of minimized/iconified) could be a valid trigger to make noconsole and noiconify "readonly" for this session of the shell?

rmyorston commented 4 months ago

Let's see if I can clarify:

(Using 'conceal/reveal' to avoid confusion ;-)

avih commented 4 months ago

I understand, and it does clarify. I didn't realize noconsole is immediate and noiconify applies the next time noconsole is applied. Thanks.

but it doesn't answer

But I'm still not sure at which context "want hidden but accept iconified, and vice verse" applies.

And it also wasn't explained why it touches the state on init even when neither -o noconsole nor +o noconsole is specified as a startup option.

And whether it might be useful to consider noconsole readonly in cases where we know or suspect that conhost is not the actual window, for instance maybe if the initial actual window state is hidden.

rmyorston commented 4 months ago

The state of the noconsole option is set to the actual state of the console host on startup. The console host is only concealed/revealed when its actual state differs from that requested by noconsole. So it shouldn't be altered on startup.

The problem with the third-party terminals arose because the function to determine the actual state needlessly distinguished between iconified/hidden. But the patch fixes that.

Since the patch seems to work (for Alacritty and rio, at least) I've applied it. There are prereleases, but everyone here knows that.

It would be good to prevent users from affecting the hidden console hosts, but I haven't yet found a reliable way to detect them.

ale5000-git commented 4 months ago

@rmyorston Hi, unfortunately it isn't completely fixed.

How to reproduce:

rmyorston commented 4 months ago

@ale5000-git I fetched your repo and installed VS Code. It wanted some other stuff, like a JDK, so I got that too.

I see there's a .vscode/tasks.json file with entries for the tasks you mention. How do I run them?

rmyorston commented 4 months ago

@ale5000-git OK, I found how to run tasks: it's in the Command Palette. Then I had to add the microg-unofficial-installer-main/tools/win directory to PATH so it could find busybox.exe and zip.exe.

I ran the two tasks, but didn't see a "small window".

I deleted the microg-unofficial-installer-main directory, reinstalled it and tried again: still no "small window".

ale5000-git commented 4 months ago

@rmyorston The PATH modifications are done automatically, no need to do that manually. You need to open the folder of the repo with VS Code, not a single file: open-with-code Note: if you extract the zip normally you may end up with microg-unofficial-installer-main/microg-unofficial-installer-main, you should use the inner most folder otherwise it won't work.

Then the ... menu => Terminal => Run activity => Select the activity run-activity

ale5000-git commented 4 months ago

The task "buildOta" doesn't have any issue but it need to run to be able to use the task "installTest".

The task "installTest": installtest

rmyorston commented 4 months ago

When I didn't have microg-unofficial-installer-main/tools/win in PATH I got this:

terminal

Initially I tried fixing it by adding my home directory to PATH, because that's where I keep the latest prerelease. That prevented the "BusyBox is missing" problem, but then it failed to find zip.exe so I decided to use that tools/win directory instead. The "installTest" task then worked, but with no "small window".

If I add a directory to PATH which contains an older busybox.exe before tools/win I get the "small window" problem.

Can you check your PATH to make sure you don't have a directory on it containing an older busybox.exe?

ale5000-git commented 4 months ago

Azz, sorry it was a stupid error. I was using the recovery script inside the cmdline script that set PATH automatically so it was fine here, but when I configured it to being run directly from VS Code I didn't remembered that fact and I also had an older busybox in PATH.

Now I have removed every custom executable from path and I confirm everything is working fine after I have fixed the code.

avih commented 4 months ago

The state of the noconsole option is set to the actual state of the console host on startup. The console host is only concealed/revealed when its actual state differs from that requested by noconsole. So it shouldn't be altered on startup.

I won't pursue this further, but I still think this subjects sh unnecessarily to bugs with the implementation of noconsole and noiconify even when the user and the shell don't care about any of those.

For instance, busybox echo or busybox grep were rightly unaffected because they don't care about the console state.

And I think so should be the shell unless one of the window-related options is used by the user. But currently it performs window state tests and potential adjustment even when the window state is don't-care both by the user and the sh.

Yes, this specific case is now fixed. But it's entirely possible to return tomorrow when a different combo of window state properties is unexpected by the code. In such case, it would be better if failure can only be observed when noconsole or noiconify are used, IMO, because, I'm guessing, in the vast majority of cases they're not used.