tig / mcec

Robust remote control of Windows PCs over the network.
https://tig.github.io/mcec/
MIT License

Inconsistent interactions between shiftdown/up and chars: commands #14

Closed hotelfoxtrotnovember closed 4 years ago

hotelfoxtrotnovember commented 4 years ago

Sorry, I've encountered one more issue that I think may be a bug, but I'm not 100% sure what the intended behavior should be. As a follow-on to #12 and #13, I thought I understood the relationship between shiftdown/up and single character (SendInput) commands. Similarly, I thought I understood how chars: commands (Chars) worked within that context--that is:

shiftdown:shift
a
chars:a
shiftup:shift

would result in "Aa" being typed in the foreground window and this is the behavior I see with the current version. However, that is not the behavior I see with shiftdown:alt, hence my confusion as to what is supposed to happen. For example,

shiftdown:alt
f
shiftup:alt

and (after clearing back with a couple of escapes)

shiftdown:alt
chars:f
shiftup:alt

both bring up the file menu when notepad is the foreground application. Which behavior is correct? Should chars: commands be impacted by shift/control/alt/win being depressed when received or should they ignore those modifier keys? If it matters, I observed similar behavior to the shift case when shiftdown:ctrl was used. That is, the find menu was brought up for "f" but the letter f was typed in notepad with "chars:f"

After re-reading the documentation a number of times, I think the shift/ctrl behavior is correct and alt is wrong (I did not test l/rwin because I'm less familiar with those shortcuts), but I apologize if I still don't completely have my head around the various command types.

Thanks!

tig commented 4 years ago

Ooh. This is good.

chars: works as follows:

1) It un-escapes the text using Regex.Unescape. Thus \u0020 turns into a space.
2) It then calls InputSimulator().Keyboard.TextEntry(text), which
3) creates two INPUT structures for each character (one for keydown and one for keyup). These all go in an array...
4) which is passed to the Windows SendInput API.
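The four steps above can be sketched in a few lines. This is a platform-independent model in Python, not the InputSimulator code itself: `unescape_unicode` and `build_events` are hypothetical names, the dictionaries stand in for INPUT structures, only `\uXXXX` escapes are handled (Regex.Unescape handles more), and nothing is actually passed to SendInput. The flag values are the documented KEYEVENTF_KEYUP and KEYEVENTF_UNICODE constants.

```python
import re

KEYEVENTF_KEYUP = 0x0002    # documented Win32 flag value
KEYEVENTF_UNICODE = 0x0004  # documented Win32 flag value


def unescape_unicode(text):
    # Step 1 (simplified): turn each \uXXXX escape into the character it names.
    return re.sub(r"\\u([0-9a-fA-F]{4})",
                  lambda m: chr(int(m.group(1), 16)),
                  text)


def build_events(text):
    # Steps 2-3: one key-down and one key-up record per character.
    # Assumes characters in the Basic Multilingual Plane (one UTF-16 unit each).
    events = []
    for ch in unescape_unicode(text):
        code = ord(ch)
        events.append({"vk": 0, "scan": code, "flags": KEYEVENTF_UNICODE})
        events.append({"vk": 0, "scan": code,
                       "flags": KEYEVENTF_UNICODE | KEYEVENTF_KEYUP})
    # Step 4 would hand this array to SendInput; here it is just returned.
    return events


if __name__ == "__main__":
    for event in build_events(r"a\u0020b"):
        print(event)
```

Every record carries a virtual-key code of 0 and the character in the scan field, which is the detail the rest of this comment turns on.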

https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-sendinput

The SendInput function inserts the events in the INPUT structures serially into the keyboard or mouse input stream. These events are not interspersed with other keyboard or mouse input events inserted either by the user (with the keyboard or mouse) or by calls to keybd_event, mouse_event, or other calls to SendInput.

This function does not reset the keyboard's current state. Any keys that are already pressed when the function is called might interfere with the events that this function generates.

When I read the above, I would assume if the array of chars included an a it would be wrapped in an INPUT structure (whatever that is) as the virtual key-code for A (VK_A). However, evidence proves this is not the case:

image

image

Digging in deeper (I didn't write the InputSimulator code; it was open source), I see that this code is used to create the down/up INPUTs:

public InputBuilder AddCharacter(char character) {
    UInt16 scanCode = character;

    var down = new INPUT();
    down.Type = (UInt32)InputType.Keyboard;
    down.Data.Keyboard = new KEYBDINPUT();
    down.Data.Keyboard.KeyCode = 0;
    down.Data.Keyboard.Scan = scanCode;
    down.Data.Keyboard.Flags = (UInt32)KeyboardFlag.Unicode;
    down.Data.Keyboard.Time = 0;
    down.Data.Keyboard.ExtraInfo = IntPtr.Zero;

    var up = new INPUT();
    up.Type = (UInt32)InputType.Keyboard;
    up.Data.Keyboard = new KEYBDINPUT();
    up.Data.Keyboard.KeyCode = 0;
    up.Data.Keyboard.Scan = scanCode;
    up.Data.Keyboard.Flags = (UInt32)(KeyboardFlag.KeyUp | KeyboardFlag.Unicode);
    up.Data.Keyboard.Time = 0;
    up.Data.Keyboard.ExtraInfo = IntPtr.Zero;

    // Handle extended keys:
    // If the scan code is preceded by a prefix byte that has the value 0xE0 (224),
    // we need to include the KEYEVENTF_EXTENDEDKEY flag in the Flags property.
    if ((scanCode & 0xFF00) == 0xE000) {
        down.Data.Keyboard.Flags |= (UInt32)KeyboardFlag.ExtendedKey;
        up.Data.Keyboard.Flags |= (UInt32)KeyboardFlag.ExtendedKey;
    }

    _inputList.Add(down);
    _inputList.Add(up);
    return this;
}

The culprits are the lines that set ...Data.Keyboard.Flags to include the KeyboardFlag.Unicode flag (KEYEVENTF_UNICODE in the Windows docs). That flag is defined as

If specified, the system synthesizes a VK_PACKET keystroke. The wVk parameter must be zero. This flag can only be combined with the KEYEVENTF_KEYUP flag. For more information, see the Remarks section.

(I note that the first INPUT uses down.Data.Keyboard.Flags = (UInt32)KeyboardFlag.Unicode; which, according to the docs is invalid because KeyboardFlag.KeyUp is not included! Not relevant here, but interesting anyway.)

So... this explains it. As implemented, all text passed via chars: will be treated as Unicode character input. Thus the state of the shift key will not impact the results.
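A toy model makes the difference concrete. This is a sketch under loud assumptions: `typed_character` is a hypothetical stand-in for the translation Windows performs, it only understands letters, and the dictionaries mimic the KeyCode/Scan/Flags fields of the INPUT structures shown above. It reproduces the "Aa" result reported at the top of this issue.

```python
KEYEVENTF_UNICODE = 0x0004  # documented Win32 flag value
VK_A = 0x41                 # letter virtual-key codes equal upper-case ASCII


def typed_character(event, shift_down):
    # Toy translation of one key-down event into the character delivered.
    if event["flags"] & KEYEVENTF_UNICODE:
        # Unicode input: the character travels in the scan field, so the
        # shift state plays no part in deciding what gets typed.
        return chr(event["scan"])
    # Virtual-key input: the character depends on the current shift state.
    letter = chr(event["vk"])
    return letter.upper() if shift_down else letter.lower()


# The single-character command "a" (virtual-key input).
vk_event = {"vk": VK_A, "scan": 0, "flags": 0}
# The command "chars:a" (Unicode input).
unicode_event = {"vk": 0, "scan": ord("a"), "flags": KEYEVENTF_UNICODE}

# shiftdown:shift / a / chars:a / shiftup:shift
typed = typed_character(vk_event, True) + typed_character(unicode_event, True)

if __name__ == "__main__":
    print(typed)  # prints Aa
```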

Now, why does shiftdown:alt work?

I'm not as sure about this as I am about the above, but I assume it's because the code in Windows that reacts to alt and ctrl is different from the code that handles shift, and looks something like this:

 if (altKeyDown && toLower(key) == VK_F) 
     UserHitAltF();

In fact, I'm VERY sure that dialog box accelerators have code that looks like this because you can do this:

image

and this

image

and they both activate the Enable/Disable Commands radio button just like if you had typed alt-e or alt-shift-e.

HOWEVER, the code in Windows that deals with menu accelerators behaves differently: in any Windows app alt-f opens the File menu, but alt-shift-f doesn't! In fact, I'm pretty sure THAT code looks like this:

 if (altKeyDown && !shiftKeyDown && toLower(key) == VK_F) 
     UserHitAltF();

This demonstrates it. First I do

alttab
pause:250
shiftdown:shift
shiftdown:alt
chars:F
shiftup:alt
shiftup:shift

And the File menu highlights, but doesn't appear.

Then I do

alttab
pause:250
shiftdown:alt
chars:F
shiftup:alt

And it works. The shell is saying "activate the menu with alt-f but NOT alt-shift-f".
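For anyone who wants to poke at the two guesses, here they are as runnable Python. Both functions are hypothetical: they restate the pseudocode above and are not taken from Windows source. The `mnemonic` parameter is an addition so the same check works for letters other than F.

```python
def dialog_accelerator_fires(alt_down, shift_down, key, mnemonic):
    # Guess 1: shift is ignored and the key is compared case-insensitively.
    return alt_down and key.lower() == mnemonic.lower()


def menu_accelerator_fires(alt_down, shift_down, key, mnemonic):
    # Guess 2: the same comparison, but a held shift key blocks the match.
    return (alt_down and not shift_down
            and key.lower() == mnemonic.lower())


if __name__ == "__main__":
    # alt+shift+F: the dialog guess fires, the menu guess does not.
    print(dialog_accelerator_fires(True, True, "F", "f"))
    print(menu_accelerator_fires(True, True, "F", "f"))
    # alt+F: both fire.
    print(menu_accelerator_fires(True, False, "F", "f"))
```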

Now that we understand it... Is there a reason you're not using <SendInput/> commands to do all of this?

tig commented 4 years ago

If I decide to fix this I have two choices, I think:

1) Fix the current chars: implementation.
2) Add a new command, say keyboard:, that behaves the way you expected.

tig commented 4 years ago

I've updated the docs with:

Note, how chars: behaves relative to the state of shift, alt, ctrl, and win keys is dependent on which modifier key is used and the application that is in the foreground. Specifically, shift is ignored and for the other modifier keys, the behavior is app dependent. Using <SendInput/> commands is recommended for fine-grain control of behavior that depends on modifier keys. See Issue #14 for more details.

hotelfoxtrotnovember commented 4 years ago

This is fascinating and a much bigger can of worms than I'd expected. I'm still trying to digest your explanation above, but to give more context, here is what I'm trying to do/why I'm not using <SendInput/> commands for all of this--I'm attempting to write a plugin for Home Remote that lets Home Remote interface elements (buttons, etc) send commands to MCEC in a straight-forward way. My use case for this plugin is to create a keyboard in Home Remote so that I can use it as a network keyboard to the HTPC running MCEC, thus all of the experimentation with the full scope of standard keyboard keys and in combination with modifier keys. So that the plugin is useful to others as well, I wanted to make it work with MCEC's out-of-box capabilities without them having to add any extra commands--I'm really interested in simulation of the keyboard as such.

As I discovered in #12 and #13, because MCEC's single character commands are really vk_ commands, and those are confusing/hard-to-understand for non-alphanumeric characters, I took the approach of simplifying things for other users (and myself) of the Home Remote plugin by working in terms of what appears to be typed on a standard US keyboard (for example, backtick instead of VK_OEM_3, which might not even be backtick on other keyboard layouts). That is, regardless of what their keyboard layout calls it/would send it as, if they wanted to type a backtick, they would send a backtick command to the plugin to get a backtick typed by MCEC. I achieved that by explicitly sending the backtick with chars:<backtick> (can't get that to format correctly), which worked great with all of the non-alphanumeric characters and was simpler than messing with the `vk` codes for those keys.

To address the fact that shiftdown/up did not alter chars:-based commands, I implemented the shifting behavior manually in the plugin. So when a shiftdown:shift was received, the plugin was smart enough to know to send chars:~ when a backtick command was subsequently sent to the plugin. The plugin tracks the state of the modifiers in addition to sending the shiftup/down commands to MCEC to accomplish this. I basically viewed chars: as kind of a bypass, sending literal values that were not impacted by modifier keys.
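To make the approach concrete, here is a minimal sketch of that tracking logic in Python. It is not the actual plugin code: `ModifierTracker`, `handle`, and the `US_SHIFTED` table are hypothetical names, the table covers only a handful of US-layout symbol keys, and only the shift modifier is tracked.

```python
# Shifted characters for a few symbol keys on a US keyboard layout.
US_SHIFTED = {
    "`": "~", "-": "_", "=": "+", "[": "{", "]": "}",
    ";": ":", "'": '"', ",": "<", ".": ">", "/": "?", "\\": "|",
}


class ModifierTracker:
    # Tracks shift state and rewrites symbol keys into chars: commands.

    def __init__(self):
        self.shift_down = False

    def handle(self, command):
        # Return the list of MCEC commands to forward for one plugin command.
        if command == "shiftdown:shift":
            self.shift_down = True
            return [command]
        if command == "shiftup:shift":
            self.shift_down = False
            return [command]
        if command in US_SHIFTED:
            # chars: ignores shift, so pick the shifted character here.
            ch = US_SHIFTED[command] if self.shift_down else command
            return ["chars:" + ch]
        # Everything else is forwarded unchanged.
        return [command]


if __name__ == "__main__":
    tracker = ModifierTracker()
    for cmd in ["shiftdown:shift", "`", "shiftup:shift", "`"]:
        print(cmd, "->", tracker.handle(cmd))
```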

That all worked great until I started testing the control and alt modifiers functionality, which led to this issue. While chars:-based inputs ignored the shift modifier, they did not ignore the alt modifier. I'm still struggling to get my head around why that would be. I still don't understand why the processing of modifiers is different at the SendInput stage. I understand your point about how the code might process such keystrokes differently once they've been extracted from the queue, but I don't understand how they can get added to the queue in a way that causes the behavior we're seeing. Said another way: why does shiftdown:alt add something to the queue that affects unicode characters and shiftdown:shift and shiftdown:ctrl do not? I'm quite out of my depth with Windows input processing, the details of the API, and how MCEC processes input, but it seems to me that if the goal is to simulate key presses, that simulation should match the actual behavior of the keypresses. That alt-shift-f would be handled differently by different applications as you showed, doesn't alter that what the application took from the queue/got from whatever was processing the queue, said that alt-shift-f were the keys that were pressed, right?

I apologize for being so long-winded and probably not fully grasping some of the core concepts, but perhaps some of the above will at least convey how I'm thinking about things and maybe you can see the flaw and/or see the most elegant way to address this. I really appreciate your time with this!

tig commented 4 years ago

It's all good. This is fun for me. I literally helped design some of this Windows innards stuff in the early 1990s and it's hilarious having it all come back.

Think of it this way:

If I changed chars: to act like a series of <SendInput/> commands (or added a new keys: command that did so), it might not do what you expect either, because it would need to translate each character into a VK code. There is no VK code for <backtick>. On US keyboards <backtick> is mapped to VK_OEM_3, but that may differ in Europe, where the keyboards are laid out differently.

This is why, if you send <backtick> as a single char, I end up sending VK_NUMPAD0 (the backtick's character code, 0x60, is also the value of VK_NUMPAD0). The only way for me to address this would be to write a US-specific decoder that does the inverse of the VK -> char mapping that Windows already does, and that would only work for US keyboard layouts.
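The collision is easy to reproduce. The sketch below assumes, purely for illustration, that a single-character command is turned into a VK code by taking the upper-cased character's code; `naive_vk` is a hypothetical name and this is a guess at the mechanism, not MCEC's source. The two VK values are the documented Windows constants.

```python
VK_NUMPAD0 = 0x60  # documented Windows virtual-key value
VK_OEM_3 = 0xC0    # documented value; the `~ key on a US layout


def naive_vk(ch):
    # Assumption: the upper-cased character code is used directly as the VK.
    return ord(ch.upper())


if __name__ == "__main__":
    print(hex(naive_vk("a")))             # letters happen to line up with VK codes
    print(naive_vk("`") == VK_NUMPAD0)    # backtick lands on the numpad 0 key
    print(naive_vk("`") == VK_OEM_3)      # and not on the key that types it
```

Letters and digits survive this kind of shortcut because their VK codes match their ASCII values; punctuation does not.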

What I recommend you do is for each key on your virtual keyboard to be assigned a VK code, NOT a character. Then for all the non-shift keys, just send the text VK_<code>. All VK_ codes for Windows are already pre-defined as <SendInputCommand/> commands so you don't need to do anything for your users except enable them (I could make enabling sub-sets of commands easier for you if that helps; you could also literally give them a MCEControl.commands file that was pre-created and just copy it into %appdir%/Kindel Systems/MCE Controller).

For the shift keys you'd send the appropriate shiftdown:/shiftup: commands.

The list of VK codes is here:

https://docs.microsoft.com/en-us/windows/win32/inputdev/virtual-key-codes
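As a rough illustration of that recommendation, a virtual keyboard could build its command sequences like this. `key_press_commands` is a hypothetical helper, the modifier names are the ones used with shiftdown:/shiftup: in this thread, and how the strings are delivered to MCEC is left out.

```python
def key_press_commands(vk_name, modifiers=()):
    # Press the modifiers, send the key, then release in reverse order.
    down = ["shiftdown:" + m for m in modifiers]
    up = ["shiftup:" + m for m in reversed(modifiers)]
    return down + [vk_name] + up


if __name__ == "__main__":
    # A tilde on a US layout: shift plus the VK_OEM_3 key.
    print(key_press_commands("VK_OEM_3", ("shift",)))
    # A plain arrow key needs no modifiers.
    print(key_press_commands("VK_LEFT"))
```

Because each on-screen key carries a VK name rather than a character, the plugin no longer needs to know what the key types on the target machine's layout.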

hotelfoxtrotnovember commented 4 years ago

That explanation was very clear and helpful, thank you! I think I follow 95% of what's going on. After sleeping on it, I had a few thoughts.

First, a quick clarification to confirm my understanding of one thing in relation to your third bullet point: because chars: is currently actually sending a literal Unicode value (instead of a vk_ code), is the reason that the behavior we observe is dependent on the receiving application due to the receiving applications varying in how they handle Unicode characters? I preface this by saying that it has been a really long time since I did any Windows UI programming and my general areas of expertise are more on the networking side and not the UI side anyway. But, if we have set the appropriate flag for alt or ctrl, and then instead of providing vk_f we provide Unicode f, is it possible that the code handling the ctrl-f functionality may not be written to work on Unicode f (but only vk_f) whereas the code handling the alt-f functionality handles both vk_f and Unicode f correctly/as-expected? This assumes that the code handling those is likely in different parts of the receiving application (and hence the discrepancy), but that seems possible to me. The menu accelerators (alt-f) are provided by the OS, I believe you said above, whereas the code to handle the ctrl-f shortcut may be part of the application framework instead (or just another part of the OS written by other people)? This seems plausible to me and explains the behavior I observed in notepad, where both shiftdown:alt f and chars:f brought up the file menu and shiftdown:ctrl f brought up the find dialog but chars:f just typed an f.

I think there is value in the ability to specify a string of characters that get treated as if they were typed one after the other as single character commands, at least as a convenience/clarity thing. As that is what you had intended for chars:, it seems like a good candidate to alter its behavior to what you thought it was anyway. I will toss out one possible problem that occurred to me with this, which is that currently if you were to be using, for example, chars:http://www.microsoft.com/ in an existing setup that is talking to MCEC, I think it would fail if you changed the behavior of chars: because the :, /, and . components would then not map to the proper vk_ codes.

I also see value in an easy mechanism for sending literal values, because I think that in general users would like to be able to literally send what they want typed on the keyboard without having to worry about any complexity of vk_ codes. I think something like a literals: command might be useful in some situations. Sending things as Unicode seems like the simplest approach to provide this, since we know it already gets passed through okay (at least when modifiers are not in play). Unfortunately, it has the problem of potentially being handled incorrectly by the receiving application, as observed and potentially for the reason I speculated above, but that could be addressed by a clear caveat in the documentation--use literals: with modifiers at your own risk.

Regarding a US-specific decoder, I tend to think that's not the right way to go, but in reading up on how SendInput works, I came across something that might be helpful. However, I don't have any kind of development environment to even attempt to test this out and it is possible you are already aware of it, so I apologize for not having done a little more legwork myself to test it out before proposing it. MapVirtualKey can be used to get a mapping between vk_ codes and character values (https://docs.microsoft.com/en-us/windows/win32/api/winuser/nf-winuser-mapvirtualkeya). Could you, at startup, generate a table for all (or at least the relevant) vk_ codes to get the corresponding character code as that is mapped for the system's current keyboard setup? That way you would have a decoder for everyone's setup (not US-specific) and you could use it for processing the single character commands more intuitively for users. Thus, <backtick> as a single character command would map to VK_OEM_3 if that was the right thing for their setup, but they could still send a <backtick> and not have to worry about what vk_ code it maps to? If this makes sense, it may actually make all the rest of the above go away (and would eliminate the problem I pointed out about changing chars: behavior), right?
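A rough sketch of the startup-table idea, written in Python so it can be tried without a C# toolchain. The mapping function is injected so the table builder runs anywhere; `build_char_to_vk` and `FAKE_US` are hypothetical, and `windows_vk_to_char` is an untested wrapper around the real `MapVirtualKeyW` call with `MAPVK_VK_TO_CHAR` (documented value 2) that only works on Windows. One limitation to keep in mind: this mapping yields the unshifted character for each key, so shifted symbols such as ~ would still need separate handling.

```python
MAPVK_VK_TO_CHAR = 2  # documented uMapType value for MapVirtualKey


def windows_vk_to_char(vk):
    # Only usable on Windows: asks user32 for the current layout's mapping.
    import ctypes
    return ctypes.windll.user32.MapVirtualKeyW(vk, MAPVK_VK_TO_CHAR)


def build_char_to_vk(vk_to_char, vk_codes=range(0x08, 0xFF)):
    # Invert a VK -> character mapping into a character -> VK table.
    table = {}
    for vk in vk_codes:
        code = vk_to_char(vk) & 0xFFFF  # drop the dead-key marker bit
        if code:
            # Keep the first VK found for a character (top-row 0 before numpad 0).
            table.setdefault(chr(code).lower(), vk)
    return table


# Stand-in for two keys of a US layout so the sketch runs on any platform.
FAKE_US = {0x41: ord("A"), 0xC0: ord("`")}
table = build_char_to_vk(lambda vk: FAKE_US.get(vk, 0))

if __name__ == "__main__":
    print({ch: hex(vk) for ch, vk in table.items()})
```

On Windows, passing `windows_vk_to_char` instead of the lambda would build the table from whatever keyboard layout is active at that moment.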

Who knew that inputs/key-presses were so complicated? :) I hope the above flows fairly well--by writing it down it helped me think through things, but I ended up cutting some ideas as I realized they wouldn't work.

Lastly, regarding making it easier to enable sub-sets of commands, that might be nice just as a general feature--as I was going through and turning things on, I found I had to do a lot of scrolling because the vk_ commands are alphabetized and not grouped more logically internally. Looking for vk_up, vk_down, vk_left, and vk_right, for example, was inconvenient, but not that big a deal. Maybe if things were just ordered more logically, that would help (all alpha stuff first, grouping related things, etc). Or perhaps just turning on all vk_ commands with a single checkbox, akin to how mouse: works? That's really just a one-time kind of problem, however, and lacking a good solution, I'm fine with whatever. It might be nice to easily get a list of what commands are currently enabled, though (a hide disabled/unchecked toggle, or something?).

tig commented 4 years ago

For now I'm going to leave things as-is. Let me know if you really need me to change something to enable your solution to work.