UltimateHackingKeyboard / firmware

Ultimate Hacking Keyboard firmware

Key chatter #128

Closed linduxed closed 6 years ago

linduxed commented 6 years ago

This behaviour is present on firmware version 8.2.0. It could be that this is fixed by a higher version of the firmware (I'm guessing that c38114648a5c57cd7de2f3135e0fc05c7f82df98 could maybe help, for instance), but I'm currently unable to update, as a result of https://github.com/UltimateHackingKeyboard/agent/issues/691.

I'm experiencing frequent but seemingly irregular occurrences of double input from pressing certain keys. This causes misspellings such as "dotted" becoming "dottetd".

The key that seems to be doing this the most is (on QWERTY) the T key. Other keys that exhibit this behaviour are I, S, F, and Space (right half, next to Fn).

There are probably other keys that I've witnessed producing a double input, but I can't remember them off the top of my head.

If one were to focus on the most frequently misbehaving key: I'm guessing that my habits of how I press keys, combined with where T is positioned, trigger some kind of edge case.

With the T key in particular I'm also noticing the occasional issue of the input being skipped (no T written), but this could be unrelated and I have to type more to confirm.

kour1er commented 6 years ago

@kbranch very interested in your findings!

Jopie01 commented 6 years ago

It could be a race condition somewhere in the firmware, or buffer overflows, which also freeze the keyboard. For me, using "SHIFT" or "MOD" together with a key sometimes gives chatter, but it's mostly "normal" keys on the right half. It may also have something to do with the USB reports when the USB bus or the system itself is busy. When upgrading my GNU/Linux system I often see more chatter, because the system is busy upgrading packages.

mondalaci commented 6 years ago

I think the new debouncing code is pretty solid and certainly much better than the old code. In my experience, there's a clear correlation between the debouncing intervals and chatter. I'd think that the chatter is produced by a variety of physical factors such as humidity, dust, temperature, and the manufacturing conditions of the switches. These factors and the wear of the switches may change over time, which can affect chatter.

@linduxed I'm surprised about the lack of registration when 50-50 is set. Haven't experienced this, and don't have an idea yet why it may happen.

@kour1er I'm not sure what else I can add. Can you experience the chatter on multiple OSes / machines? Shouldn't be OS dependent but I'm wondering.

@kbranch Logic analyzer measurements would be very welcome.

kour1er commented 6 years ago

So in an effort to provide more data, I typed the same paragraph with my UHK in a variety of different computers. In all cases, my UHK was set to 15-15 for press release as @mondalaci is using. As you can see from the test runs, I'm still experiencing quite a bit of chatter.

The test computers: Mac Pro running 10.13.6, Mac Mini running 10.13.6, Asus running Windows 7.

I detail the connection method at the top of each test run. I include a control keyboard sample at the top (Kinesis Freestyle Pro).

Typing Test

mondalaci commented 6 years ago

@kour1er Your UHK seems to be the most problematic of all by far. I'll get in touch with you very soon and offer a replacement unit.

kour1er commented 6 years ago

@mondalaci Cheers and sorry to be your problem typer :)

mondalaci commented 6 years ago

@kour1er No problem at all! You're part of the pilot run, and I'm pretty sure we messed up something.

linduxed commented 6 years ago

@mondalaci Please note that I'm not saying the following to get a replacement keyboard:

Could my keyboard also be one that could be considered notably problematic? Should I try to do the typing test of @kour1er?

I'm mainly interested in finding some general test that people can do to determine whether they have a problematic board. I think it's in every owner's (and I assume UHK developer's) interest to be able to definitively know whether it's a hardware or software issue.

mondalaci commented 6 years ago

@linduxed Please do the same test as @kour1er with 50-50 settings. I'm interested in your findings. At this point, I don't have a better idea regarding testing UHKs.

kbranch commented 6 years ago

So I'm a little late because I got sidetracked last night, but I did get to break out the oscilloscope and logic analyzer tonight.

I started by removing my 5 key so I could get a good look at the entire rising edge when pressing it. I connected a 10k resistor between 3.3v and one side of the key and connected the other side of the key back to its regular spot on the board, where the controller can see it. This made the controller think multiple keys were being pressed when I press the 5 key, but that's not a big deal for a test.

From there, I hooked the key up to an oscilloscope to see just how noisy key presses are. I can take some screenshots if they're of any use, but they're not terribly interesting. Most presses took around 0.3 ms to go from solidly low to solidly high. The worst I ever saw was ~1 ms. I'd guess that I looked closely at a few dozen presses on the scope.

Next, I hooked up a logic analyzer so I could line up Switch Hitter's detected chatter with what's physically going on. I did my best to remember to set debounceTimePress and Release to 1 ms after power cycling. This got kind of murky because detecting "chatter" in Switch Hitter does not necessarily mean anything physically went wrong at all, but I'll try to summarize what I learned on that front:

And here are the conclusions that I'm reasonably confident of:

I didn't test 8.4.2 for as long as 8.4.0 (I was on 8.4.0 when I started), but I thought it was also passing the "carefully press <= 6 keys" test while I had my key removed. I think I'm noticing a little chatter as I type this with 8.4.2 and 5 ms debounce (press and release), though. Before today, I had been using 8.4.0 for quite a while with the default 50 ms debounce without noticing any issues, but I'll play around with settings and versions as I use it regularly this week to try to narrow that down before I pop a key back off to try the logic analyzer again.

I also stopped testing 8.3.3 early because I thought I caught it red handed. I swear I recorded it sending a key press that didn't register on the logic analyzer, but I must have misinterpreted the data - it's a legitimate key press. I'll need to try again later, it definitely failed the "carefully press <= 6 keys" test, I just missed recording it.

Just let me know if you'd like me to try anything else next time, I don't mind fiddling with it to get to the bottom of this.

kbranch commented 6 years ago

As I play with it today (without the logic analyzer), Switch Hitter has recorded extra key presses for keys that were definitely pressed for much longer than its threshold, using 8.4.0 and 8.4.2 with 5 ms debouncing. It's rarer than I would have caught in my testing last night, I'll have to hook multiple keys up next time.

In the two examples I have so far, the extra keystroke actually happened on the release. I have to admit that I was focusing on the key press last night - I took a few looks at the release without seeing anything nasty, but I'll have to look closer.
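The kind of check described here can be approximated offline from any key event log. This is a minimal sketch, not Switch Hitter's actual algorithm: the event format, the `find_chatter` name, and the 20 ms default threshold are all assumptions chosen for illustration. It flags a press of a key that follows a release of the same key by less than the threshold, which is the "extra keystroke on release" pattern.

```python
from typing import Dict, List, Tuple

# Hypothetical event format: (time in ms, key name, "down" or "up").
Event = Tuple[float, str, str]


def find_chatter(events: List[Event], threshold_ms: float = 20.0) -> List[Tuple[str, float]]:
    """Return (key, gap_ms) for each press that follows a release of the
    same key by less than threshold_ms."""
    last_up: Dict[str, float] = {}
    suspects: List[Tuple[str, float]] = []
    for t, key, kind in sorted(events):
        if kind == "up":
            last_up[key] = t
        elif kind == "down" and key in last_up:
            gap = t - last_up[key]
            if gap < threshold_ms:
                suspects.append((key, gap))
    return suspects


# One real press of "t" whose release bounces 6 ms later, then a second real press.
log = [(0, "t", "down"), (80, "t", "up"), (86, "t", "down"),
       (90, "t", "up"), (400, "t", "down"), (470, "t", "up")]
print(find_chatter(log))  # only the press at 86 ms is flagged
```

A threshold like this cannot tell a bounce from a genuinely fast double tap, so any hit still has to be checked against what was actually typed.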

kbranch commented 6 years ago

A little more testing with the logic analyzer, this time with some actual chatter captured.

I didn't feel like desoldering a switch again tonight, so I changed my tactics a little and set the logic analyzer to look for low pulses between 6 and 16 ms. The firmware seems to poll every ~5 ms, so that means the key looks like it was released for only one or two polls before being pressed again.
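One way to read that trigger window, as a rough sketch: assume that with the switch still in the matrix, the probed line shows one short pulse per scan while the key is held, so a longer-than-usual gap between pulses means some scans saw the key open. The function name, the input format (a list of pulse timestamps in ms), and the exact 5 ms period are assumptions for illustration.

```python
def missed_polls(pulse_times_ms, poll_period_ms=5.0):
    """Return (gap_start_ms, gap_ms, missed) for every gap between
    consecutive scan pulses that is long enough to hide missed scans."""
    gaps = []
    for prev, cur in zip(pulse_times_ms, pulse_times_ms[1:]):
        # A normal gap is one period; each extra period is one missed scan.
        missed = round((cur - prev) / poll_period_ms) - 1
        if missed > 0:
            gaps.append((prev, cur - prev, missed))
    return gaps


# Pulses every 5 ms, with one scan missing after 10 ms and two after 25 ms.
print(missed_polls([0, 5, 10, 20, 25, 40, 45]))
```

Under these assumptions a 10 ms gap corresponds to one missed scan and a 15 ms gap to two, which is why a 6 to 16 ms window would catch exactly those cases.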

I caught chatter on release twice - I've attached screenshots, Switch Hitter logs and the raw logic analyzer captures (use this software to view the captures).

I think the next step is to desolder a key again and take a better look at those release edges. A special firmware version for debugging that leaves the row lines high all the time may or may not help identify weird issues, though.

ChatterOnRelease.zip

Jopie01 commented 6 years ago

I dug a bit into the source, and if @kbranch is right about the polling of ~5 ms, maybe there is room for some changes. On Arduino I'm also using debouncers, but the polling speed is way faster (polling at 8 MHz, debouncing at 500 ms). That way I'm able to detect the rising or falling edge and start the debouncing right after, and I also get a more accurate result. So polling is running at 8 MHz, and when an edge is detected, the debounce start time is set. During the polling the debounce time is continuously checked. You have the debounce time in the USB reports now; what if you put it right in the keyscan function?
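The approach described above (poll fast, record the time of each raw edge, keep checking the elapsed debounce time on every poll) could look roughly like this. It is a sketch of the general technique in Python for readability, not the UHK firmware's debouncer or the Arduino code mentioned; the class name and the 5 ms value are assumptions.

```python
class EdgeDebouncer:
    """Accept a new state only after the raw input has stopped changing
    for debounce_ms, measured from the most recent raw edge."""

    def __init__(self, debounce_ms: float, initial: bool = False):
        self.debounce_ms = debounce_ms
        self.stable = initial    # debounced state that would be reported
        self.last_raw = initial  # most recent raw reading
        self.edge_time = 0.0     # time of the most recent raw edge

    def poll(self, raw: bool, now_ms: float) -> bool:
        if raw != self.last_raw:
            # Raw edge: restart the debounce timer.
            self.last_raw = raw
            self.edge_time = now_ms
        if raw != self.stable and now_ms - self.edge_time >= self.debounce_ms:
            self.stable = raw
        return self.stable


# A press at 1 ms that bounces until 3 ms, polled once per millisecond.
d = EdgeDebouncer(debounce_ms=5)
raw_samples = [False, True, False, True, True, True, True, True, True]
print([d.poll(raw, t) for t, raw in enumerate(raw_samples)])
```

With these samples the press is reported at 8 ms, i.e. 5 ms after the last bounce edge. The faster the polling, the more precisely the last edge is timed.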

Something completely different. I discovered that my switches seem to have two press states (maybe that's normal with mechanical switches). When I press the switch slowly, around halfway through, the character already appears on the screen. When pressing the key fully until its stop, the character is repeated. Sometimes this happens, sometimes it doesn't.

kbranch commented 6 years ago

The interesting part to me is that the release bounce is apparently lasting >= 5 ms in some cases (15 ms between visible pulses in one of the captures I posted). Polling faster and using a smarter debounce algorithm might help, but this much bouncing seems unusual for Cherry switches. I'm very curious to see the full release waveform.

mondalaci commented 6 years ago

@kbranch Thanks for the detailed writeup! Very much appreciated.

The debouncer actually uses a millisecond timer, not a 5 ms timer, but I don't think it would make any difference.

You measured a release bounce that lasts for 15 ms which is larger than the 5 ms maximum specified bounce time of MX switches. This begs for some explanation.

I can imagine that the vast majority of the MX switches are within spec when they roll out from the factory, but a lot happens during their lifetime which can affect their bounce time. The switches get transported to a distributor, then to a manufacturer where they're subjected to a lot of heat during wave soldering. Then they arrive at their final destination and get smashed by the user millions of times, experience temperature and humidity fluctuations, eat some dust, etc.

The bounce time of a switch is a result of its history. Based on a very small sample that @kbranch measured, it seems that the bounce time can be as high as 15 ms, but I'm sure it can be even higher.

As another example, I'm pretty sure something happened to @kour1er's UHK that resulted in extremely long bounce times. Being a pilot run UHK, my guess is that the operator of the selective wave soldering machine was tweaking process parameters, and the switches got too much heat and behaved funky as a result.

Given the above, I think the best strategy is to use the longest debounce time that doesn't cause usability issues. What usability issues? The UHK must be able to discern discrete keypresses, even when hit in rapid succession by humans. But how rapid is that? @kour1er sent me a crazy finger-breaking game earlier that should give us a good idea. I can go slightly above 300 (that is, if 300 weren't the upper limit in the game), maybe about 330, but I surely couldn't go above 400, and I doubt any human could go above 500 or 600.

300 hits per 30s translates to 100ms between hits, and 600 hits per 30s translates to 50ms which is the current default debounce interval, so I'm feeling quite strongly that the current value is close to optimal.
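The arithmetic above as a small sketch (the function name is made up for illustration): the average interval between hits is the window length divided by the hit count.

```python
def ms_between_hits(hits: int, window_s: float = 30.0) -> float:
    """Average time between hits, in milliseconds, over a window of window_s seconds."""
    return window_s * 1000.0 / hits


print(ms_between_hits(300))  # 100.0
print(ms_between_hits(600))  # 50.0
```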

Also, based on your and my observations, the new debouncer is way better than the old one, and we received almost no complaints about the old one either, maybe only 1-2.

When using the new debouncer with the default 50-50 debounce values, if bounces happen, I'd assume it's the switches, not the debouncer in which case a return is justified.

Feel free to let me know if you think I'm missing something, guys.

@Jopie01 @linduxed Do your UHKs still produce chatter?

Jopie01 commented 6 years ago

@mondalaci Mine is still producing some chatter, but only sometimes, and not as much as @kour1er's. It seems to have to do with temperature and humidity, as @TorC8 suggests, and with how long the keyboard has been in use. When I stop using my PC I pull the plug, so there's no connection to the electricity any more. When I start using the PC, the keyboard works just fine, with no chatter at all. But after some time (30 - 45 minutes) it starts to chatter. It first starts on the left half, mostly with the top row, the numbers and the qwerty row.

What I'm also seeing is that I can't press the 2 as long as the other numbers. It happens also with the 3, 4 and 5 but the 2 is better reproducible.

Like @mondalaci says, I'm also starting to think it's a mechanical issue instead of software. So maybe it's a loose solder joint; time to open up the keyboard and check the soldering.

mondalaci commented 6 years ago

@Jopie01 Interesting findings! Feel free to disassemble your UHK. Your warranty won't be void, and we'll provide a replacement unit if requested.

kbranch commented 6 years ago

@mondalaci - The 5 ms timing I was referring to was the rate at which keys appear to be polled, judging from the ~3 us pulses every ~5 ms in the logic analyzer capture.

While leaving the debouncing time at 50 ms isn't the end of the world, I would point out that we're concerned more with the minimum time between state changes than with the average time. I can't do much better than that ~10 Hz number over 30 seconds with one finger, but my time between release and press can commonly be around 20 ms, with occasional excursions down as low as about 12 ms.

Since I have to admit that this does seem to be in hardware at this point, the best approach might be faster polling combined with a smarter debouncing algorithm, like @Jopie01 suggested.

While the total bounce time I measured is comparable to the quickest time I've been able to change key states, it should be possible to identify two key presses that are separated only by bouncing. Any bouncing that follows a signal that has been steady for longer than a certain threshold means the key has changed states (barring EMI issues that will have to be accounted for).
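A sketch of that idea, under stated assumptions: the first edge after a quiet period is accepted immediately as a real state change, and any further edges are ignored until the raw signal has been steady again for `steady_ms`. The class name and threshold are hypothetical, and the fallback branch (resyncing to the raw level once it has settled) is an addition so that a bounce ending on the "wrong" level cannot leave a key stuck. EMI is not handled specially: a glitch after a quiet period would be reported and then corrected by that fallback.

```python
class EagerDebouncer:
    def __init__(self, steady_ms: float, initial: bool = False):
        self.steady_ms = steady_ms
        self.state = initial                 # debounced state
        self.last_raw = initial              # most recent raw reading
        self.last_edge_ms = float("-inf")    # time of the most recent raw edge

    def poll(self, raw: bool, now_ms: float) -> bool:
        if raw != self.last_raw:
            steady_for = now_ms - self.last_edge_ms
            self.last_raw = raw
            self.last_edge_ms = now_ms
            if steady_for >= self.steady_ms:
                # The signal was quiet before this edge: treat it as a real change.
                self.state = raw
        elif raw != self.state and now_ms - self.last_edge_ms >= self.steady_ms:
            # Fallback: the raw level has settled on the other state.
            self.state = raw
        return self.state


# Press at 10 ms (bounces), release at 100 ms (bounces), press again at 115 ms.
d = EagerDebouncer(steady_ms=5)
trace = [(0, False), (10, True), (11, False), (12, True), (13, True),
         (100, False), (101, True), (102, False), (115, True)]
print([d.poll(raw, t) for t, raw in trace])
```

In this trace the second press arrives only 13 ms after the release bounce ends and is still reported as a separate press, with no added latency on either edge.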

I'm not an expert by any measure, but I found an interesting post on key scanning and debouncing. The section at the beginning that describes (ab)using DMA for scanning the matrix may or may not be of interest - I'm not familiar at all with the capabilities of the microcontrollers in use in the UHK.

kour1er commented 6 years ago

@mondalaci new UHK arrived. Flashed with the latest 8.4.0 firmware (direct from the Github Agent version compiled locally), using the default 50/50

typing test 2

As you can see, ZERO BOUNCE :)

It's so great to have a manufacturer really wanting to make the perfect keyboard 👍

mondalaci commented 6 years ago

@kbranch I read the referenced article a while back. Not sure whether the DMA approach is viable on our MCU, but I'd rather not make the algorithm more complicated at this point: due to the variation of switch bounce times, testing it can be a huge time sink, and the results may not be much better. As the LSTM example suggests, it's easy to overengineer this problem. I'd rather look into it again once we've shipped the modules and things settle down. I'll probably close this issue in the near future if nobody complains.

@kour1er So glad your new UHK is working well! Thanks for enduring the pilot run! I understand that it must have been frustrating.

Jopie01 commented 6 years ago

@mondalaci We can now be sure that it is a mechanical issue. I've disassembled the UHK and looked thoroughly at the PCB through a microscope, but couldn't find anything. But then I remembered the keyboard not fully passing CE tests, and discovered that the inserts and magnet on my UHK were not covered. So I covered only the magnet with a small piece of plastic (I don't have epoxy) on both sides, and it's now running like clockwork! This may also be a reason why nobody else is complaining about chatter and why @kour1er's new keyboard is working perfectly.

If I may, I want to ask @kour1er and @kbranch if it is possible to do the same with their chattering keyboard to check if this can be the culprit. A strong magnet very near a PCB.

acarabott commented 6 years ago

I've started experiencing some chatter on my UHK: should this be assumed to be a hardware issue at this point?

If it helps track down the batch, my order number was 22936

kbranch commented 6 years ago

Interesting find, @Jopie01. I've actually been using my left half without a case (and thus without a magnet), just to make it easier with all the wires I have hanging off the PCB for testing with the logic analyzer. I just took my right half apart too, I'll see how it does. I'll try reassembling everything with some insulation in there too if I still get some chatter.

mondalaci commented 6 years ago

@acarabott Please upgrade to firmware 8.4.4 in Agent, and let me know how it works. Please note that this is a pre-release firmware version and may freeze once in a while. We're doing our best to fix the freeze bug. You can downgrade to 8.2.5 any time which is the latest stable firmware version.

acarabott commented 6 years ago

@mondalaci ok have upgraded. will report back in a couple of days

iprok commented 6 years ago

Chattering began for me today. It never happened before (for some weeks). I haven't used my PC for some days. I'm on 8.4.2 firmware for now. Will flash 8.4.4 and report back after some testing.

mondalaci commented 6 years ago

@iprok Upgrading from 8.4.2 to 8.4.4 will hardly improve things because the debouncer is unaffected. How bad is the chatter?

iprok commented 6 years ago

Today it has repeated just twice during the last hour, so it seems to be minor for me, and very rare. It seems like my keyboard is "tired". (Today was a hard day with a lot of keyboard work.) I've finished for today, so I'll see if the chattering goes away after powering it off for a long time.

acarabott commented 6 years ago

@mondalaci I have been using 8.4.4 for a few days now. Definitely an improvement over 8.2.5 but still occasionally getting some repeats.

Jopie01 commented 6 years ago

After I covered both ends of the strong magnet I haven't had any key chatter. Before, I wasn't able to type strong passwords (capitals, numbers, strange characters) when I wanted to log in to a remote server over a slow connection. Now I don't have any issue with it. I didn't change my behaviour or anything else. Just to mention, I'm using the UHK split.

mondalaci commented 6 years ago

@Jopie01 Yours is a very interesting finding that I wouldn't have ever thought of. Only the magnets of non-EU pilot run UHKs should not be covered from inside of the case. This should be solved for every other UHK, but I'm wondering whether we messed up something.

(Back in the day, our thinking was that CE is unreasonably strict, and it was a pain to cover magnets with epoxy, so we only did it for EU pilot run UHKs given that CE only pertains to EU units.)

@TorC8 is from the pilot run and located outside of EU, too, so the uncovered magnet issue might apply in his case.

@acarabott @iprok You guys are from mini batch 5 and 6 respectively which is quite close. Maybe something has happened around that time.

@kbranch @linduxed @kareltucek Still having this issue? If so, please share your mini batch id, which you can look up based on your order id and the delivery status page. Maybe there's a correlation here.

If you're reading this, and you're affected by this issue, please disassemble your UHK, and check if the ends of the magnets are exposed inside of the case, and report back. (Your warranty won't be void.)

Edit: Please hold on guys. Let's wait for @Jopie01's answer below. I don't want you to do extra work for no good reason.

kareltucek commented 6 years ago

@mondalaci nope, I was never truly affected by this issue.

mondalaci commented 6 years ago

Let me show you a couple of pictures guys, so that you can be sure what to look for.

It's important to note that we only sealed the right magnet after the pilot run, because only it was required by CE. This is the right magnet with its ends exposed inside of the case:

right-exposed

The following is also the right magnet, but this time sealed. The mold was modified, so the ends of the magnet are not exposed anymore.

right-covered

This is the left magnet unsealed. We didn't ever seal this because it wasn't required by CE:

left-exposed

@Jopie01 Are you from the pilot run? If not, the right magnet of your UHK should have already been sealed, in which case you could only have sealed the left magnet. Is this the case?

Jopie01 commented 6 years ago

@mondalaci Yes, I'm from the pilot run. I only disassembled the left half because I had a lot of chatter coming from the left half. The left half is the same as in your last picture. I had some wrong 3D-printed parts lying around, made of PLA and 2 mm thick. With some acetone I melted both together. The magnet is now completely sealed. See below.

18080005_1

Since I've done that, no chatter any more. It's strange that a small piece of plastic kind of fixes this. I'm also wondering if the steel plate is grounded to the PCB.

mondalaci commented 6 years ago

@Jopie01 This is very strange because according to the EMC tests, the magnet of the left half shouldn't be sealed, yet it did cause problems on your end. The steel plate is not grounded to the PCB. I'm not sure if it should be. The EMC tests didn't show any problems in this respect, although as your case suggests, this doesn't necessarily mean anything. Very nice fix by the way!

@acarabott @iprok @kbranch @linduxed According to my knowledge you guys experience chatter. If so, would you please disassemble your UHK, and seal the ends of the left magnet with drops of epoxy as on the following picture?

epoxied-magnet

Please also check the right magnet, although it should be sealed unless you're a non-EU pilot run participant.

I'm sorry for the trouble, and looking forward to hearing whether this resolves the issue on your side.

TorC8 commented 6 years ago

I'm just getting things settled again after a couple weeks away from my keyboard. Finally upgraded to 8.4.4, and checked my right half. The magnet is covered with epoxy (I'm guessing getting the replacement for the pogo pin issue put me in an effectively slightly later group there). I'll monitor how it behaves for a little while, and then cover the other magnet, if needed, and the matching steel plates.

If air conditions have anything to do with the issue, I'm pretty sure I'm right about at the top of the list for trouble. While looking at the magnet, I also found that my keyboard connection rods are getting rusty(!). Guess it's time for a little maintenance there.

mondalaci commented 6 years ago

Yes, your right magnet must have been sealed because of the replacement. Please cover the left magnet the next time as suggested. By "steel plates", I assume you mean the magnet counterparts. Feel free to cover them, but preferably only cover the left magnet first, then give it a try, and only cover the magnet counterparts afterwards one by one, so that we can isolate the cause of the issue.

Regarding the rust, by "keyboard connection rods", do you mean the pogo pins or the stainless steel guides that mechanically hold the two halves together?

TorC8 commented 6 years ago

OK, will do on staging covering the magnets and matching plates. I haven't noted chattering since I got back, but we'll see how the 50/50ms timing works for a bit before I cover anything.

Actually, I have suspected a couple dropped keypresses, but the only place I can reliably see it there's an audible difference in the sound of the right shift as opposed to the other keys. I'm also apt to hit it right on the edge, not close to the center, which may have an effect on behavior.

The rust is on the mechanical guide rods, not the pogo pins. I checked with a 10x loupe, just to be sure I was giving accurate information, and not because something got on the rods. I'm aware they are stainless, which is no doubt why it's just a little surface rust, and not an ugly mass of reddish brown. If it can rust, it will do so here, since a whole ocean is about a mile upwind of me.

kbranch commented 6 years ago

It does look like my right case is using the updated mold with the covered magnet. My left is uncovered.

I used both halves of the keyboard without any case at all this week, and I think the chatter was about the same as before disassembly. I never had it half as bad as some people in this thread, though - it's relatively common with the debouncer set at 1 ms, rare at 5 ms and nonexistent at 50 ms.

I'll cover the left magnet with epoxy, desolder all my logic analyzer wires and reassemble both halves tonight to see if that changes anything.

mondalaci commented 6 years ago

@kbranch Feel free to proceed as suggested, but if the chatter is nonexistent at 50/50ms at your side, I wouldn't consider this an issue due to the potential variation of switch properties described in https://github.com/UltimateHackingKeyboard/firmware/issues/128#issuecomment-412376765

kbranch commented 6 years ago

@mondalaci I'm still a little skeptical that as much bounce as I've recorded is normal for Cherry switches, but I haven't had much luck finding any real data one way or the other. Either way, it's clear that my complaint is pretty minor compared to others in this thread, and it could easily be unrelated. I'll report back if anything changes after reassembly, but otherwise I'll take my name out of the hat on this issue.

mondalaci commented 6 years ago

@TorC8 I've talked to András, my colleague about the rusty guides. He told me that initially, gloves weren't used during the assembly procedure of the steel guides, and an assembly worker had a rather acidic sweat which affected a couple of guides, resulting in rust formation on the guides over time.

The fix is to use a very fine sand paper to remove the rust. (Emphasis on very fine, because the guides must not be scratched.) Then apply silicone spray on the guides to avoid further rust formation. This should prevent the issue occurring again. We've been using gloves for a while, and we'll try to treat the surface of the guides in the future to completely eliminate the possibility of this occurring again.

We're sorry about this issue. This is one of the few issues we couldn't foresee, and yours is the first feedback of its kind. Feel free to follow up about this at support@UltimateHackingKeyboard.com, so that we can keep this thread on topic.

linduxed commented 6 years ago

I will run some tests this evening, so we can compare my results with those of the other commenters here.

linduxed commented 6 years ago

Here's my test writing with 50-50 settings, on firmware 8.4.4:

https://gist.github.com/linduxed/ae4fe89719650cd4abd7d3c592951e0c

It's two texts and one key tapping test to check some of my previously problematic keys. As evidenced by the output, the problem is still there.

My order number was 22729, so the second mini-batch.

I have not yet tried opening up the keyboard, might do that during next week.

mondalaci commented 6 years ago

@linduxed Thanks for your feedback! I'm rather confused by your samples because in the first sample g is often omitted, and in the second sample g is often inadvertently inserted. Neither of these cases looks like chatter, but more like a random communication problem. How's that?

kbranch commented 6 years ago

I can only speak for myself, but my chatter does often look like what @linduxed is reporting in that second sample. Since the bounce was often on the release event, there had often already been another key pressed by that point.
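A toy illustration of why a bounce on release can look like a stray inserted letter rather than a doubled one. The event format and timings are invented for the example: the next key goes down before the first key's release bounces, so the spurious press lands after it.

```python
def typed_text(events):
    """Concatenate the keys of all "down" events in time order.
    events: iterable of (time_ms, key, "down" or "up")."""
    return "".join(key for _, key, kind in sorted(events) if kind == "down")


# Typing "go" with overlapping keys; g's release bounces at 76 ms,
# after o has already been pressed at 60 ms.
events = [(0, "g", "down"), (60, "o", "down"), (70, "g", "up"),
          (76, "g", "down"), (80, "g", "up"), (130, "o", "up")]
print(typed_text(events))  # gog
```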

linduxed commented 6 years ago

My chatter has always manifested as both extra repetitions of input and omitted input.

As a sidenote I should mention that I use the Colemak layout, so in the linked text the most problematic letter ("g") is on the QWERTY T-key. My original post at the top of this thread details other keys that I've had problems with (I guess V could be added to that list).

mondalaci commented 6 years ago

@kbranch It's an interesting coincidence. Now that I examine @linduxed's sample again, I realize that the QWERTY T key is very close to the uncovered magnet. I think chances are fair that this is the culprit. I'm curious what will happen if the ends of the magnet are sealed. I could use a typing sample from you too, so feel free to share.

linduxed commented 6 years ago

OK, I'll try this out during the coming week then. I'll also try to go down from 50-50 to something lower with the keyboard modification in place.

mondalaci commented 6 years ago

I wouldn't go below 50-50 yet. I think we should make sure in this issue that things are stable at 50-50, and later optimize the timing values in another issue.