keyboardio / Kaleidoscope

Firmware for Keyboardio keyboards and other keyboards with AVR or ARM MCUs.
http://keyboard.io
GNU General Public License v3.0
757 stars 258 forks source link

high-level design changes for Boot/NKRO switching #1305

Closed tlyu closed 10 months ago

tlyu commented 1 year ago

What do you want to change?

Clarify the Boot vs NKRO functionality and implementation. This is mostly me thinking out loud and trying to capture a few things before possibly suggesting some more narrowly scoped changes.

Background

The host probably considers the Boot Keyboard and the NKRO Keyboard to be two independent keyboard interfaces (sub-devices) within a single USB device. (Technically, the NKRO Keyboard is a single report within a multi-report HID device interface.)

The Boot Keyboard only produces HID Boot Protocol reports, regardless of what protocol the host has requested with the Set Protocol request. This is fine, and probably best for backwards compatibility. The NKRO Keyboard only produces NKRO reports, regardless of what protocol the host has requested with the Set Protocol request. This is probably wrong from a HID compliance standpoint. (The NKRO Keyboard shouldn't even honor requests to switch to Boot Protocol, because it currently lacks code to produce Boot Protocol reports. Also, being a multi-report HID interface, it couldn't send any of its other reports, because Boot Protocol reports can't have Report IDs.)

Kaleidoscope has quirks that cause the behavior of these two keyboards to be interrelated, by selectively routing higher level method calls to them. Roughly, if the protocol variable in the Boot Keyboard is set to the Boot Protocol, whether by Kaleidoscope or the host, Kaleidoscope will stop updating the NKRO Keyboard with key events, and stop asking the NKRO Keyboard to send reports. Likewise, if the protocol variable in the Boot Keyboard is set to the Report Protocol, Kaleidoscope will stop updating the Boot Keyboard with key events, and stop asking the Boot Keyboard to send reports.

There is some additional background here

How will it make Kaleidoscope better?

It might make Kaleidoscope behave a bit better around host wakeups from sleep or hibernate states. Waking up from deeper sleep or hibernation seems likely to be a situation where the BIOS or preboot environment is active, before restoring OS runtime state to RAM from mass storage.

It might also improve compatibility with KVMs and docking stations, which might expect a Boot Keyboard to actually send key events, instead of all key events going over a multi-report HID interface.

What trouble might users have in adapting to the new functionality?

It's theoretically possible that some users might prefer to always start in NKRO mode, but that seems unlikely. Currently, the state variables to switch between Boot Protocol and Report Protocol also have the effect of deactivating the Consumer Control reports, but that can be changed. (Note that we still want to add timeouts, in case those reports also get blocked because of a BIOS that only talks Boot Protocol and isn't polling the multi-report interface.)

What trouble might developers have in adapting to the new functionality?

It will shift around some abstractions. Hopefully the changes will all improve readability and maintainability of the code.

Specific host behavior

FreeBSD

OpenBSD

macOS

Windows 10

gedankenexperimenter commented 1 year ago
  • Track key events on both the Boot Keyboard and the NKRO Keyboard, but have Kaleidscope select which ones to send reports on.

Possibly worth considering: When I wrote an experimental version of Kaleidoscope a few years ago, I had it store only one NKRO report, regardless of which protocol it was using, and if Boot Protocol was being used, I just had it translate the report on the fly.

tlyu commented 1 year ago

See #1314 for some Windows-related observations that might influence this design.

theAlexes commented 11 months ago

Hi all, per suggestion a in this link https://www.devever.net/~hl/usbnkro - it is theoretically possible to always send the boot protocol in the report, but mark it as padding in the HID descriptor, thus completely eliminating the need to switch protocols. Has anyone considered this implementation?

obra commented 11 months ago

I know it's something we'd considered in the past and I recall that it wasn't as cut and dried as the proposal made it sound, but I don't see good, clear notes about why. It might be worth someone trying out.

On Mon, Nov 20, 2023 at 11:23 AM atax1a @.***> wrote:

Hi all, per suggestion a in this link https://www.devever.net/~hl/usbnkro

  • it is theoretically possible to always send the boot protocol in the report, but mark it as padding in the HID descriptor, thus completely eliminating the need to switch protocols. Has anyone considered this implementation?

— Reply to this email directly, view it on GitHub https://github.com/keyboardio/Kaleidoscope/issues/1305#issuecomment-1819665691, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAALC2CB5BGP24TX2DQRCJDYFOU4LAVCNFSM6AAAAAARYWD4VCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMJZGY3DKNRZGE . You are receiving this because you are subscribed to this thread.Message ID: @.***>

theAlexes commented 11 months ago

We would be glad to test any experimental branches out on OS X (where the Keyboardio works fine) and FreeBSD, which currently seems to want a kernel option set in order to make the Keyboardio switch to HID protocol correctly.

tlyu commented 11 months ago

One problem with prepending the boot protocol keyboard report to the bitmap report and marking it as padding is that some hosts or KVMs might want a boot keyboard to have a report of exactly 8 bytes. They might also check that the endpoint length is 8 bytes. (edit: this is either not an issue, or is already a problem with Model 01 and Atreus, because those advertise 64-byte packets for the boot keyboard endpoint)

Appendix B of USB HID 1.1 says that a boot device report may not exceed 8 bytes, which I can interpret to mean even in its non-boot report format.

I guess we could take the approach in https://static.wongcornall.com/ibm-capsense-usb-web/ibm-capsense-usb.html#x1-160003.3.2 and have both the boot keyboard interface and the NKRO (part of the multi-report interface) always produce reports, but the boot keyboard report is marked as padding and ignored by HID-aware hosts.

These approaches also increase bandwidth usage, due to sending two HID reports or a larger HID report on each key event. It might not adversely affect the total bus bandwidth, but it could increase key latency (but maybe not by a lot).

obra commented 11 months ago

In general I wouldn't expect the added latency to be human-perceptible. Compared to what's shuffled across USB busses these days, it's just not that much data. I would be moderately surprised if implementations choked in the way that you describe, Taylor. But not enough that I'd bet money against it.

But I think the right thing to do is for somebody interested to try this out..if it works at all, we'll need to see how it performs on "weirder" bios/preboot environments.

On Mon, Nov 20, 2023 at 12:50 PM Taylor Yu @.***> wrote:

One problem with prepending the boot protocol keyboard report to the bitmap report and marking it as padding is that some hosts or KVMs might want a boot keyboard to have a report of exactly 8 bytes. They might also check that the endpoint length is 8 bytes.

Appendix B of USB HID 1.1 says that a boot device report may not exceed 8 bytes, which I can interpret to mean even in its non-boot report format.

I guess we could take the approach in https://static.wongcornall.com/ibm-capsense-usb-web/ibm-capsense-usb.html#x1-160003.3.2 and have both the boot keyboard interface and the NKRO (part of the multi-report interface) always produce reports, but the boot keyboard report is marked as padding and ignored by HID-aware hosts.

These approaches also increase bandwidth usage, due to sending two HID reports or a larger HID report on each key event. It might not adversely affect the total bus bandwidth, but it could increase key latency (but maybe not by a lot).

— Reply to this email directly, view it on GitHub https://github.com/keyboardio/Kaleidoscope/issues/1305#issuecomment-1819783457, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAALC2AGZF4U5YELVXM55L3YFO7ADAVCNFSM6AAAAAARYWD4VCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMJZG44DGNBVG4 . You are receiving this because you commented.Message ID: @.***>

tlyu commented 11 months ago

This is an observation about HID standard conformance of NKRO. A reasonable reading of HID 1.1 Appendix C forbids the use of full key bitmaps for keyboards. For context, these requirements are not limited to Boot Keyboards.

Non-modifier keys must be reported in Input (Array, Absolute) items. Reports must contain a list of keys currently pressed and not make/break codes (relative data).

In practice, full bitmap NKRO seems to work in many host implementations, but KVM, etc. vendors who claim that our NKRO keyboard report is non-conforming might be technically correct. I'm not sure why that requirement is in there; maybe there were vendors who didn't want to implement reading a full bitmap keyboard for complexity reasons? There do seem to be enough other bitmap NKRO keyboards out there that this requirement is ignored in practice.

obra commented 11 months ago

Are there actual claims that the bitmap report is nonconforming or just KVMs (that themselves are doing possibly noncompliant things) that don’t work right?

On Sun, Nov 26, 2023 at 11:49 AM Taylor Yu @.***> wrote:

This is an observation about HID standard conformance of NKRO. A reasonable reading of HID 1.1 Appendix C forbids the use of full key bitmaps for keyboards. For context, these requirements are not limited to Boot Keyboards.

Non-modifier keys must be reported in Input (Array, Absolute) items. Reports must contain a list of keys currently pressed and not make/break codes (relative data).

In practice, full bitmap NKRO seems to work in many host implementations, but KVM, etc. vendors who claim that our NKRO keyboard report is non-conforming might be technically correct. I'm not sure why that requirement is in there; maybe there were vendors who didn't want to implement reading a full bitmap keyboard for complexity reasons? There do seem to be enough other bitmap NKRO keyboards out there that this requirement is ignored in practice.

— Reply to this email directly, view it on GitHub https://github.com/keyboardio/Kaleidoscope/issues/1305#issuecomment-1826833078, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAALC2DBOA2JQS7N5BHA5ILYGNXIBAVCNFSM6AAAAAARYWD4VCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMRWHAZTGMBXHA . You are receiving this because you commented.Message ID: @.***>

tlyu commented 11 months ago

Are there actual claims that the bitmap report is nonconforming or just KVMs (that themselves are doing possibly noncompliant things) that don’t work right?

I think I remember someone mentioning on Discord that their KVM vendor made claims like that. I don't remember the exact wording.

obra commented 11 months ago

I know we’ve seen Kaya vendors that absolutely don’t support any kind of bitmap report

On Sun, Nov 26, 2023 at 1:12 PM Taylor Yu @.***> wrote:

Are there actual claims that the bitmap report is nonconforming or just KVMs (that themselves are doing possibly noncompliant things) that don’t work right?

I think I remember someone mentioning on Discord that their KVM vendor made claims like that. I don't remember the exact wording.

— Reply to this email directly, view it on GitHub https://github.com/keyboardio/Kaleidoscope/issues/1305#issuecomment-1826858297, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAALC2B5PHLZ4UIT6AELKRDYGOBCLAVCNFSM6AAAAAARYWD4VCVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTQMRWHA2TQMRZG4 . You are receiving this because you commented.Message ID: @.***>

tlyu commented 11 months ago

@theAlexes I've linked the draft PRs that have a proof of concept fix, if you want to try them out. I'm definitely interested to hear how this behaves on FreeBSD.

Note you'll need to update both your KeyboardioHID and Kaleidoscope repositories. If you would prefer a firmware snapshot, I can probably do that, if you let me know your keyboard model. (I've only tested on Model 100 so far.)

theAlexes commented 11 months ago

Preliminary results with these patches applied are promising! The keyboard is usable without adjusting the loader parameter, making it much easier for us to set the loader parameter in the first place. :) This did change something about how the keyboard identifies to Wayland making our per-keyboard customization not work, though (not a major breaker, though).

For reference, the old firmware appears in dmesg on freebsd 13.2-RELEASE without the tunable set as so:

ugen0.3: <Keyboardio Model 01> at usbus0
umodem0 on uhub0
umodem0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
umodem0: data interface 1, has no CM over data, has break
ukbd1 on uhub0
ukbd1: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
kbd3 at ukbd1
uhid0 on uhub0
uhid0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
ums1 on uhub0
ums1: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
ums1: 8 buttons and [XYZT] coordinates ID=1

While with the draft PRs checked out it enumerates/attaches as so:

ugen0.3: <Keyboardio Model 01> at usbus0
ukbd1 on uhub0
ukbd1: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
kbd3 at ukbd1
uhid0 on uhub0
uhid0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
ums1 on uhub0
ums1: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
ums1: 8 buttons and [XYZT] coordinates ID=1
umodem0 on uhub0
umodem0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 3> on usbus0
umodem0: data interface 1, has no CM over data, has break

we'll update later with what it looks like with the tunable set.

theAlexes commented 11 months ago

With the tunable set, the old keyboard and new keyboard both show up with the same dmesg:

ugen1.5: <Keyboardio Model 01> at usbus1
umodem0 on uhub3
umodem0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 5> on usbus1
umodem0: data interface 1, has no CM over data, has break
usbhid0 on uhub3
usbhid0: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 5> on usbus1
hidbus0: <HID bus> on usbhid0
hkbd0: <Keyboardio Model 01> on hidbus0
kbd2 at hkbd0
usbhid1 on uhub3
usbhid1: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 5> on usbus1
hidbus1: <HID bus> on usbhid1
hms0: <Keyboardio Model 01 Tablet> on hidbus1
hms0: 8 buttons and [XYW] coordinates ID=0
usbhid2 on uhub3
usbhid2: <Keyboardio Model 01, class 239/2, rev 2.00/1.00, addr 5> on usbus1
hidbus2: <HID bus> on usbhid2
hms1: <Keyboardio Model 01 Mouse> on hidbus2
hms1: 8 buttons and [XYWH] coordinates ID=1
hsctrl0: <Keyboardio Model 01 System Control> on hidbus2
hkbd1: <Keyboardio Model 01> on hidbus2
kbd3 at hkbd1
hcons0: <Keyboardio Model 01 Consumer Control> on hidbus2

The other weird corner case that wasn't quite working was trying to run OpenBSD on a macbook pro. Let me try that one too.

tlyu commented 11 months ago

Thanks for the detailed updates!

With the tunable set, the old keyboard and new keyboard both show up with the same dmesg:

Does this tunable enable some non-default kernel drivers that know how to deal with multi-report HID interfaces?

theAlexes commented 11 months ago

Yeah, that looks to be what the mentioned tunable does on FreeBSD.

It looks like the previous arrangement, however, works better on OpenBSD — refreshing our memory, it was the built in macbook keyboard that wasn't working there.

OpenBSD seems to prefer the previous report structure — a keyboard flashed with the previous release enumerates as follows (typed by hand, as the machine at hand isn't connected to the network yet):

uhidev0 at uhub0 port 6 configuration 1 interface 2 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev0: iclass 3/1
ukbd0 at uhidev0
wskbd0 at ukbd0: console keyboard, using wsdisplay0
uhidev1 at uhub0 port 6 configuration 1 interface 3 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev0: iclass 3/0
uhhid at uhidev1 not configured
uhidev2 at uhub0 port 6 configuration 1 interface 3 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev2: iclass 3/0 8 report ids
uhid at uhidev2 reportid 1 not configured
uhid at uhidev2 reportid 4 not configured
uhid at uhidev2 reportid 5 not configured
ukbd1 at uhidev2 reportid8
wskbd1 at ukbd1 mux 1
wskbd1: connecting to wsdisplay0
"Keyboardio Model 01" rev 2.00/1.00 addr 3 at uhub6 port 6 configuration 1 not configured

Despite the 'not configured' lines, this all works fine.

The new firmware gives the following, similar except for the "no usable key codes array" error leading it to attach to a different keyboard. It also sort of works as a console keyboard but the keystrokes are laggy and typing quickly mixes them up. Macros don't suffer from keystrokes getting mixed up, but they do lag heavily. I'm also unsure if openbsd has a similar tunable to load similar drivers.

uhidev0 at uhub0 port 6 configuration 1 interface 2 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev0: iclass 3/1
ukbd0 at uhidev0: no usable key codes array
uhidev1 at uhub0 port 6 configuration 1 interface 3 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev0: iclass 3/0
uhhid at uhidev1 not configured
uhidev2 at uhub0 port 6 configuration 1 interface 3 "Keyboardio Model 01" rev 2.00/1.00 addr 3
uhidev2: iclass 3/0 8 report ids
uhid at uhidev2 reportid 1 not configured
uhid at uhidev2 reportid 4 not configured
uhid at uhidev2 reportid 5 not configured
ukbd1 at uhidev2 reportid8
wskbd0 at ukbd1: console keyboard, using wsdisplay0
"Keyboardio Model 01" rev 2.00/1.00 addr 3 at uhub6 port 6 configuration 1 not configured
tlyu commented 11 months ago

Thanks for the detailed updates! The OpenBSD results are very interesting.

It also sort of works as a console keyboard but the keystrokes are laggy and typing quickly mixes them up. Macros don't suffer from keystrokes getting mixed up, but they do lag heavily. I'm also unsure if openbsd has a similar tunable to load similar drivers.

Would you say the delay is about a quarter of a second per keystroke?

My current hypothesis is that OpenBSD sees that the boot keyboard is sending only padding, so is giving up on polling that endpoint entirely, causing the keyboard firmware to time out (250ms each) on sending those reports. This is kind of the opposite of the expected failure mode with these changes. My expectation was that by sending both the multi-report and the boot report, hosts that don't set boot protocol will cause the keyboard to time out sending the non-boot reports.

theAlexes commented 11 months ago

Would you say the delay is about a quarter of a second per keystroke?

yeah, it's approximately that.

OpenBSD

Looking deeper and comparing hidkbd.c against the dmesg, it looks like it didn't attach the boot keyboard and did attach the multi-report keyboard. Weird.

tlyu commented 10 months ago

@theAlexes could you please try the new PRs (hybrid boot keyboard) linked above? They use an alternative approach (boot and NKRO reports on the same endpoint) that might work better with OpenBSD. I would appreciate hearing your feedback.

theAlexes commented 10 months ago

@tlyu preliminary results: this approach works well on the FreeBSD machine without the tunable and the OpenBSD MacBook, both at the loader and once the OS has booted. Going to go grab the FreeBSD machine that has the tunable enabled.

theAlexes commented 10 months ago

@tlyu FreeBSD with the usbhid tunable enabled also works! Unfortunately we don't have any KVM switches to test with, but this approach at least satisfies our use cases.

tlyu commented 10 months ago

@theAlexes thanks for the testing! That is very helpful. It confirms a few things that I had deduced from source inspection. I'll update the issue description with some details of OS-specific things that we've discovered so far.

tlyu commented 10 months ago

Updated description with host-specific details.

tlyu commented 10 months ago

Updated PRs so they implement switching of whether the boot keyboard sends only Boot Protocol. Appears to work with preliminary testing.

tlyu commented 10 months ago

My summary of the current situation: we have an approach (hybrid boot keyboard) that leads to strict improvement in multiple "hard" platforms (FreeBSD, OpenBSD), and no known regressions yet. How do we want to expose this for more testing? I can clean up the patches to a reviewable condition in maybe a week or less.

theAlexes commented 10 months ago

just updated our quiet-click model-01. we'll run 'em through their paces on the machines and OSes in our possession — one we haven't tried was Haiku. looking forward to it :)

tlyu commented 10 months ago

@gedankenexperimenter thank you for your suggestion of making the NKRO report canonical, and constructing the boot report from it as needed. It wound up being simpler than trying to maintain the two reports independently.

theAlexes commented 10 months ago

@tlyu the changes as of yesterday work fine on FreeBSD (default), FreeBSD (usbhid), OpenBSD, and OS X preboot. We weren't able to get Haiku running, but we did try it on the North Coast Synthesis "Gracious Host", a boot-protocol-only implementation: https://northcoastsynthesis.com/products/msk-014-gracious-host.html — it works great there too.

One request in the boot-protocol switch code, though: Can you use B instead of 6 for Boot Protocol, because our layout doesn't have numbers in the top layer (we use TopsyTurvy to make the numbers appear under Shift)

tlyu commented 10 months ago

@tlyu the changes as of yesterday work fine on FreeBSD (default), FreeBSD (usbhid), OpenBSD, and OS X preboot. We weren't able to get Haiku running, but we did try it on the North Coast Synthesis "Gracious Host", a boot-protocol-only implementation: https://northcoastsynthesis.com/products/msk-014-gracious-host.html — it works great there too.

Thanks for the extensive testing! Did you not get Haiku running at all, or did you have problems getting it to talk to the keyboard?

One request in the boot-protocol switch code, though: Can you use B instead of 6 for Boot Protocol, because our layout doesn't have numbers in the top layer (we use TopsyTurvy to make the numbers appear under Shift)

Yes, I could do that. I could even make it compile-time configurable.

theAlexes commented 10 months ago

Haiku wouldn't boot on either of the victim machines, and then we realized we had an even more cursed USB host than that :)

Thanks for the willingness to experiment — I realize this is a "creative violation of the letter of the spec", but it definitely solved the FreeBSD pain point we were having with this keyboard.

tlyu commented 10 months ago

Updated the Kaleidoscope patch to default to B instead of 6 for boot protocol. Also made it configurable, and documented the mode indicator behavior.

tlyu commented 10 months ago

"creative violation of the letter of the spec"

I'm not so sure about that. The main thing of the current approach that might violate the letter of the spec is the bitmap NKRO report, which we (and many others) were already doing. A device detaching and re-enumerating with different descriptors is almost indistinguishable (from a protocol level) from it being physically unplugged and replaced with a different device. I guess you could argue that the VID/PID should be different when that happens, but I'm not sure that's explicitly required.

Relatedly, Windows 10 persists cached device descriptors across hibernation and bus resets, at least on some USB3 ports, if the device hasn't been physically removed (or maybe is merely at the same physical port), and has the same device descriptor as before.

theAlexes commented 10 months ago

Okay, we've tried one more cursed device, the BMOW ADB/USB Wombat, which is closed-source, supposedly contains a HID parser, but still, even with this patch, requires us to manually switch the keyboard to boot protocol only. I think this is because it attaches to the first keyboard it sees, and is thus pulling an OpenBSD, ignoring the "padding" in the boot report device, concluding that it doesn't have any keys, and not trying any of the other endpoints.

tlyu commented 10 months ago

Thanks for the additional test!

thus pulling an OpenBSD, ignoring the "padding" in the boot report device, concluding that it doesn't have any keys, and not trying any of the other endpoints

The new hybrid report has the boot report array marked as padding, with a bitmap report after. A complete HID implementation should know what to do with it. From the vendor page at https://www.bigmessowires.com/usb-wombat/ it looks like it's using an unspecified Microchip HID library with possibly limited report descriptor parsing capabilities. It might not know how to deal with bitmap key reports at all.

theAlexes commented 10 months ago

ah, right, we were confusing it with the previous patch. I'll poke the BMOW fellow and see if he can release a firmware that requests boot protocol.

tlyu commented 10 months ago

Another possibility is that it could be fully understanding the report descriptor, but only reading the first 8 bytes of the report for some reason, despite it being declared as a longer report.

tlyu commented 10 months ago

For anyone watching, I updated #1361 so it no longer requires the separate KeyboardioHID change, now that KeyboardioHID is merged into Kaleidoscope.