cb22 / macbook12-spi-driver

WIP input driver for the SPI touchpad / keyboard found in the 12" MacBook (MacBook8,1 + MacBook9,1)
GNU General Public License v2.0
300 stars 106 forks source link

Commit c90e breaks the driver on Macbook 9,1 #22

Closed Pamelloes closed 7 years ago

Pamelloes commented 7 years ago

Commit c90e4b4 causes the driver to no longer work on the Macbook 9,1. This commit changes the driver from using a reduced clock speed to having a delay. I'm currently investigating further.

cb22 commented 7 years ago

I should be getting my Macbook9,1 back this week, so I'll be able to give it a glance too. Let me know if you find anything 👍

roadrunner2 commented 7 years ago

@Pamelloes Can you elaborate? Does the keyboard not work anymore too? What happens if you just reduce the clock frequency again? Or if you increase the delay?

The delay given in the _DSM returned info is unit-less, so I guessed it to be 10µs based on testing on my MBP13,3, but it could be that that's to short. However, in #9 it was reported that this worked on a MacBook8,1, so I'm a bit surprised why it wouldn't work for you.

Also, did you rebuild the module dependencies (sudo depmod) after updating? I.e. have you verified the driver actually gets loaded? The above change introduces a new dependency on crc16, and some folks had issues because it's a module and wasn't pulled in due to the dependencies not having been updated.

Pamelloes commented 7 years ago

After commit c90e, neither the keyboard nor the touch pad responds to any form of input. I was initially testing just using make test, however I just did a full install including depmod and modprpbe and I have confirmed that both remain unresponsive. The really interesting thing, though, is that when on the latest commit, when I toggle caps lock on my USB keyboard the led toggles on the embedded keyboard—so clearly the driver is functioning to a certain extent. When I changed the speed back to 400000 hz everything remained and the led stopped working ("error writing to device 20 2 0 0" is reported on attempted led toggle). I experimented with different values of delay and nothing changed. Ultimately I have no idea what might be going on. Hopefully this helps a bit :)

Sent from my iPhone

On May 8, 2017, at 22:31, roadrunner2 notifications@github.com wrote:

@Pamelloes Can you elaborate? Does the keyboard not work anymore too? What happens if you just reduce the clock frequency again? Or if you increase the delay?

The delay given in the _DSM returned info is unit-less, so I guessed it to be 10µs based on testing on my MBP13,3, but it could be that that's to short. However, in #9 it was reported that this worked on a MacBook8,1, so I'm a bit surprised why it wouldn't work for you.

Also, did you rebuild the module dependencies (sudo depmod) after updating? I.e. have you verified the driver actually gets loaded? The above change introduces a new dependency on crc16, and some folks had issues because it's a module and wasn't pulled in due to the dependencies not having been updated.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

roadrunner2 commented 7 years ago

Wow, this is very interesting. The thing about the caps-lock led turning off and on is that that is triggered by an event from the input subsystem; i.e. the fact that hitting the caps-lock key toggles the led means that key events are being read by the driver, the driver is sending input events to the core, those are being relayed libinput, which in turn detects caps-lock and sends a led event back to the input core, that forwards it to the driver, and the driver sends an spi command. Given that this whole sequence is working, and that (at least) some key events can be read from the device and some commands sent to the device, I'm definitely a bit at a loss. I'm not sure this is an spi-communications issue then.

Can you try the following:

Pamelloes commented 7 years ago

I don't think I communicated quite right. None of the driver inputs worked–--neither the keyboard nor the trackpad. When I pressed the caps lock key on the laptop, nothing happened. However, when I pressed the caps lock key on the USB keyboard---not the laptop---the caps lock light on the laptop turned on. So, the write capabilities of the driver work but not necessarily the read capabilities. I'll try to run those commands later tonight and report back.

Edit: Ok, the device shows up, but when I hexdump It, nothing happens at all. Restoring to the last working commit, I correctly see three lines per keystroke.

Edit 2: I turned on DEBUG_ALL_READ and DEBUG_UNKNOWN_PACKET and I can confirm that the driver does read input from the keyboard whenever I press a key. However, the packet detected is 21061, not 288. I naively changed packet_keyboard to see if that would fix things but of course it didn't. I can provide some of the hex output if that would be helpful. One hint unnoticed was that after pressing several keys, the keyboard will totally flip out for about 15 seconds and then restart.

roadrunner2 commented 7 years ago

@Pamelloes Sorry for the confusion - you clearly wrote what was going on, but somehow I managed to miss that (I'll blame lack of sleep 😪 ). So that makes much more sense.

Regarding the packets received, thanks for turning on debugging. So the first two bytes indicate direction of the transfer (read/write) and the device (keyboard/trackpad). 21061 would be 0x45 0x52, which makes no sense (generally only one bit is set in either byte), so together with your hint about things flipping out this strongly indicates the transfer in the read direction at least is somehow messed up. However, that only appears to apply to async reads - sending the caps-lock-led command (as well as the trackpad init command) actually involves sending the command and then reading back a 4 byte status; I added checking of that status and it logs whenever that's wrong, as you noticed when you used the lower frequency. So if you aren't seeing these logs, then at at least that part is working fine.

Instead of changing the frequency, can you try increasing the cs-delay? Line 482 (or 472 if you're on HEAD) sets it to 10 times the dsdt reported value - try changing that to something higher, like 100 or 1000 times the reported value. You could even try a lower value like 0, which was essentially what was used before. But because the reads are a single-transfer message, I have trouble seeing how this could be the issue. Instead, it seems more like we may need a delay after receiving the interrupt and before doing the read. Need to think about this some more.

Pamelloes commented 7 years ago

Ok, I kind of got it to work. I noticed that 20 01 occurred consistently 10 bytes in so on line 731 I changed the memcpy to be from (applespi->rx_buffer + 10). The touch pad still does not work at all and the keyboard is super inconsistent, however the key I press does get typed. Often the key gets missed, though, and other times a key gets stuck down. Perhaps there's an extra ten bytes coming in somewhere else, too. I'll have to examine the protocol further.

On May 10, 2017, at 07:19, roadrunner2 notifications@github.com wrote:

@Pamelloes Sorry for the confusion - you clearly wrote what was going on, but somehow I managed to miss that (I'll blame lack of sleep 😪 ). So that makes much more sense.

Regarding the packets received, thanks for turning on debugging. So the first two bytes indicate direction of the transfer (read/write) and the device (keyboard/trackpad). 21061 would be 0x45 0x52, which makes no sense (generally only one bit is set in either byte), so together with your hint about things flipping out this strongly indicates the transfer in the read direction at least is somehow messed up. However, that only appears to apply to async reads - sending the caps-lock-led command (as well as the trackpad init command) actually involves sending the command and then reading back a 4 byte status; I added checking of that status and it logs whenever that's wrong, as you noticed when you used the lower frequency. So if you aren't seeing these logs, then at at least that part is working fine.

Instead of changing the frequency, can you try increasing the cs-delay? Line 482 (or 472 if you're on HEAD) sets it to 10 times the dsdt reported value - try changing that to something higher, like 100 or 1000 times the reported value. You could even try a lower value like 0, which was essentially what was used before. But because the reads are a single-transfer message, I have trouble seeing how this could be the issue. Instead, it seems more like we may need a delay after receiving the interrupt and before doing the read. Need to think about this some more.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub, or mute the thread.

roadrunner2 commented 7 years ago

I saw exactly this sort of thing too when the delay was missing, i.e. random data before the real data. So to me this further points to us reading the value too quickly after receiving the interrupt. Can you try if this simple patch to delay the read helps: read-delay.patch.txt (it's against current master). As it is it adds a 10µs delay before reading the keyboard/touchpad, based off two unknown settings in the dsdt which both specify 10µs, but try higher values if necessary (max is 65535).

Pamelloes commented 7 years ago

After applying the patch, it works perfectly!

roadrunner2 commented 7 years ago

Great. Thanks for your patience. Pull request #26 has this fix now.

cb22 commented 7 years ago

@roadrunner2 @Pamelloes I seem to be seeing the exact same thing (even the caps lock behaviour), even after #26 on my MacBook9,1 (took a bit longer to get it back than I would have liked, but anyways). I'll do some more digging and try and figure out what's going on.

roadrunner2 commented 7 years ago

@cb22 So on reads you're getting packets that have a few random bytes preceding the real data? If so, I'd try increasing the delay_usecs in the dl_t packet. How reliable are the caps-lock-button toggles? Any errors there at all?

cb22 commented 7 years ago

@roadrunner2 yeah. Caps Locks toggles have had no problems at all. I just tried bumping up the delay_usecs to various different values and it didn't seem to have much of an effect.

The packet I'm getting when reading (for example) at 8MHz is:

[ 2715.770095] applespi: 45 52 6d b3 20 20 20 20 20 02 00 00 00 00 56 00 10 02 00 7a 00 00 4c 00 02 00 02 fe 00 00 00 01
[ 2715.771657] applespi: 75 ea 20 04 61 67 a6 11 00 02 07 97 02 00 06 00 1e 00 00 00 00 00 01 00 10 20 00 00 05 00 00 00
[ 2715.773298] applespi: ed 08 a1 07 10 00 07 04 02 01 ad 09 89 07 ec 00 b8 00 9b 05 f8 03 0d 63 2e 01 ea 00 00 00 00 00
[ 2715.774907] applespi: 0f 00 00 00 c0 77 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2715.776544] applespi: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2715.778184] applespi: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2715.779824] applespi: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 2715.781458] applespi: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00

which seems to have a leading 45 52 6d b3 20 20 20 20. This leading bit is extremely constant between reads (the first two bytes are always identical, and the offset of the 'real' data is too).

roadrunner2 commented 7 years ago

@cb22 Just to clarify, that's just the delay_usecs on the dl_t transfer, right?

That leading data doesn't look like the end of a previous packet, so this isn't some out-of-phase issue. Can you try adding a sleep before issuing the async read? E.g.

        udelay(10);
        applespi_async(...);

The difference between this and the dl_t sleep is that the latter is done after cs is asserted, whereas this one would be before (I believe this executes in an interrupt context, so doing a udelay() here is definitely not a final solution...)

cb22 commented 7 years ago

I mentioned it elsewhere; this seems to have magically resolved itself on my 9,1 and I can't seem to reproduce it anymore.

I'm going to close this for now, until either myself or someone else can replicate.