maximkulkin / esp-homekit

Apple HomeKit accessory server library for ESP-OPEN-RTOS
MIT License
1.11k stars 170 forks source link

Need for Half-Paired concept #14

Open HomeACcessoryKid opened 6 years ago

HomeACcessoryKid commented 6 years ago

Today I witnessed that you have a 'trap' in your state machine. If for some reason (I have traced one) the verify step of the pairing process is not executed, then the device is considered paired and refuses to pair again.

Please see my repo for the use of the half-paired concept of how to solve it. Summary: only consider pairing official once the GET /accessories URL has been received (and maybe even deeper in the flow, like the identify that seems to be triggered always nowadays.)

>>> HomeKit: Configuring mDNS
>>> HomeKit: [Client 4] Successfully paired
>>> homekit_client_process: [Client 4] Finished processing
>>> HomeKit: [Client 4] Closing client connection

no verify step follows, and after 22 seconds, iPhone starts again with step1/3

>>> HomeKit: Got new client connection: 4
>>> homekit_client_process: [Client 4] Got 125 incomming data
>>> homekit_server_on_pair_setup: Pair Setup
>>> homekit_server_on_pair_setup: Free heap: 35896
>>> tlv_debug: Got following TLV values:
>>> tlv_debug: Type 6 value (1 bytes): \x01
>>> tlv_debug: Type 0 value (1 bytes): \x00
>>> HomeKit: [Client 4] Pair Setup Step 1/3
>>> homekit_server_on_pair_setup: Free heap: 35800
>>> HomeKit: [Client 4] Refusing to pair: already paired
>>> send_tlv_response: [Client 4] Sending TLV response
>>> tlv_debug: Got following TLV values:
>>> tlv_debug: Type 7 value (1 bytes): \x06
>>> tlv_debug: Type 6 value (1 bytes): \x02
>>> client_send: [Client 4] Sending payload: HTTP/1.1 200 OK\x0D\x0AContent-Type: application/pairing+tlv8\x0D\x0AContent-Length: 6\x0D\x0AConnection: keep-alive\x0D\x0A\x0D\x0A\x07\x01\x06\x06\x01\x02
>>> homekit_client_process: [Client 4] Finished processing
>>> HomeKit: [Client 4] Closing client connection
maximkulkin commented 6 years ago

Why would client drop verifying and retry?

HomeACcessoryKid commented 6 years ago

Thing is that in real life, it does. Our mission is to handle real life situations. This does not imply that if you have a way to avoid this situation by fixing the initial condition, that is fine. But for now, it does not hurt to be robust and take care that unless the phone got to the screen where the accessory is really recognised, it is not considered paired.

renssies commented 6 years ago

As it turns out, with an iPhone X this happens more often. If you're not looking at the screen the device will eventually lock, causing the pairing process to stop on the controller side, which causes the half-paired state.

I've had it twice now with an iPhone X, once in low power mode (so a really short device on duration) and once without.

maximkulkin commented 6 years ago

Hey, can you try same with official accessories? On Tue, Jan 16, 2018 at 02:55 Rens Verhoeven notifications@github.com wrote:

As it turns out, with an iPhone X this happens more often. If you're not looking at the screen the device will eventually lock, causing the pairing process to stop on the controller side, which causes the half-paired state.

I've had it twice now with an iPhone X, once in low power mode (so a really short device on duration) and once without.

— You are receiving this because you commented.

Reply to this email directly, view it on GitHub https://github.com/maximkulkin/esp-homekit/issues/14#issuecomment-357924553, or mute the thread https://github.com/notifications/unsubscribe-auth/AAAqIQOmPLusdVb6Gcd4DgMG9x6nP1_mks5tLIANgaJpZM4RVqqs .

renssies commented 6 years ago

I just tried it with new out of the box Koogeek plug (P1EU) and I was able to reproduce it. Once it fails to pair, due to the iPhone falling to sleep (locking). The Koogeek starts flashing red for about 5-10 seconds, after that it's available for pairing again and pairs just fine.

Not sure how they solved the issue, I guess they just reset somehow?

HomeACcessoryKid commented 6 years ago

See above how I did it in my repo. However that could be improved on by not writing to flash until the confirmation is in.

maximkulkin commented 6 years ago

I understand that this is a problem, although I do not see an elegant solution to it. As a workaround: can you just make sure your device does not go to sleep when you pair? It's a ONE TIME operation initiated by user and it takes just under a minute.

The other thing is it would be nice to improve "reset" story.

HomeACcessoryKid commented 6 years ago

I have the following parts that might contribute to an elegant solution.

Whenever a pair-setup finishes successfully, you could keep the relevant information in a RAM variable struct as a cache and set a half-paired flag. On pair-verify you would always start to see if the info matches the cache first.

In the identify accessory routine (it seems to be guaranteed during pairing process) you could check the half-paired flag and then write the cache to flash. If the phone 'ordered' the identify, it seems sure that pairing was successful.

maximkulkin commented 6 years ago

I believe, accessory is not added to iOS until you review it on screen and press "Add" button. Which obviously happen after it has downloaded info and did identify routine. Thus, even if user manage to prevent device from sleeping while it was pairing but then did not approve addition and device went sleeping, it still won't add accessory (despite accessory having pairing info already).

renssies commented 6 years ago

@maximkulkin A accessory is added to the Default room if you skip the "Add" screen. As soon as it's paired it's in the HomeKit database and soon synced over iCloud, just confirmed this with Koogeek plug.

affonsoloyola commented 6 years ago

Had the problem naturally with iPhoneX. I think a simple timeout for the whole process to happen should be enough.And/or: If pairing is left in the middle. Should be ignored if same client restarts the process. As if never happened. Would that be possible? I’m still studying this whole thing (Apple docs and this code )

pbendersky commented 6 years ago

I think I'm having the same issue. I failed to pair, and now I get this log anytime I try to pair:

ip:192.168.1.47,mask:255.255.255.0,gw:192.168.1.1
>>> HomeKit: Got new client connection: 4
>>> HomeKit: [Client 4] Pair Setup Step 1/3
>>> HomeKit: [Client 4] Pair Setup Step 2/3
>>> HomeKit: [Client 4] Pair Setup Step 3/3
>>> HomeKit: Added pairing with 94F4C638-BA4A-41AB-B2E6-15ACAA9D056A
>>> HomeKit: Configuring mDNS
mDNS announcement: Name=AC Controllermd=AC-IRpv=1.0id=90:26:9C:DB:9A:BCc#=1s#=1ff=0sf=0ci=21 Port=5556 TTL=4500
>>> HomeKit: [Client 4] Successfully paired
>>> HomeKit: [Client 4] Closing client connection
>>> HomeKit: Got new client connection: 4
>>> HomeKit: [Client 4] Pair Setup Step 1/3
>>> HomeKit: [Client 4] Refusing to pair: already paired
>>> HomeKit: [Client 4] Closing client connection

Any idea on how to reset it so I can start the pairing process from scratch?

maximkulkin commented 6 years ago

You need to have some way to reset accessory. E.g. sonoff_basic accessory supports reseting accessory configuration if you do long press on it's button. Or you can just erase_flash and re-flash again (or even reset sectors that hold esp-homekit configuration without need to reflash everything, like esptool.py -p $ESPPORT -b 115200 erase_region 0x100000 4096)

pbendersky commented 6 years ago

I did an erase_flash (it's a Wemos D1 Mini, not a Sonoff). The log I attached is right after the erase_flash. I'm using iOS 12 if that helps...

HomeACcessoryKid commented 6 years ago

I still consider that a fix on the accessory side is possible and desired. It is clear this happens ever so often in real life and is very annoying.

@maximkulkin Do you want us to help designing the solution in more detail?

maximkulkin commented 5 years ago

@HomeACcessoryKid Sure, let's talk about that.

HomeACcessoryKid commented 5 years ago

With the progress of time, I have observed that nowadays there is always an 'identify' command sent from the phone to the accessory once pairing completes. If we use this as a criterium to switch from 'pairing' to 'paired' then I think this issue would be solved.

maximkulkin commented 5 years ago

Ok, makes sense. Do we have a repro steps or just implement that blindly? I sure can use some client library to emulate the whole process, but if there is a natural way to reproduce this problem, that would help. Maybe like "device locks screen while pairing"?

HomeACcessoryKid commented 5 years ago

I didn't try to find a way to force it locking up, but your suggestion to lock the screen seems like a good candidate.
Maybe we (I) should first practice to make it fail to learn the way to break it down.
If it is not easy to force it with my older model (I still have an iPhone SE ;-) we need help from others...

d4rkmen commented 4 years ago

Pairing is still a weak place for this library. One of two times pairing step 1,2 connection dropps and process restarted (try 2), but faced "Another pairing in progress" error, then pairing again (try 3), on this step mDNS error allocating 1460 bytes appear. And need to reset the device. I believe the slow CPU and long responce time make the controller (iphone) to break pairing process but not 100% sure

maximkulkin commented 4 years ago

Oh thank you for summing that up.

mDNS allocations is not a bug, it’s a limitation put on the size of requests it can process. Most likely we do not need to process those requests anyways.

A lot of times people writing their own firmwares liberally allocate memory like they have infinite amounts of it and then complain about failed pairings. This is embedded environment boys.

Those “Another pairing is in progress” reports is kind of new and I have heard about it only from one person so far (you’re the second one) and it sounds like something is weird in your environment (why would same device start multiple pairing attempts at the same time?!). Most people I know about (including me) have ZERO problems with pairing. Try applying patch from outstanding PR by @maccoylton, he sent it because he was having similar issues, and let me know if that works for you.

peros550 commented 4 years ago

I also have faced the above message “Another pairing is in progress“

why would same device start multiple pairing attempts at the same time

It could happen when pairing for whatsoever reason takes too long to complete , and the user can’t wait for the first attempt, decides to cancel the process and tries again from the beginning. Then, in the second attempt that message may show up.

To add in the conversation, I can recall that during one of my tests I achieved to block a device from HomeKit. No matter what, HomeKit reported that my device was already paired although the device were not (message : Refusing to pair: already paired)

I finally was able to pair the device successfully by >

A) changing its mDNS name AND B) change its HomeKit pairing code.

maximkulkin commented 4 years ago

It could happen when pairing for whatsoever reason takes too long to complete , and the user can’t wait for the first attempt, decides to cancel the process and tries again from the beginning. Then, in the second attempt that message may show up.

Well, then it is correct and expected behavior. You cancel process, you'll still have to wait. No cheating there. I believe @maccoylton had a different problem when controller tried to connect, gave up and reconnected without any user intervention.

maccoylton commented 4 years ago

Thats correct Max, my issue was a second connection from the IOS device attempting to pair without any user intervention.

maximkulkin commented 4 years ago

To add in the conversation, I can recall that during one of my tests I achieved to block a device from HomeKit. No matter what, HomeKit reported that my device was already paired although the device were not (message : Refusing to pair: already paired)

Yes, that was the original problem: if you pair and due to long duration of the process controller (iPhone) goes to sleep and unloads Home.app when acccessory was in final stage (step 3) of pairing, then accessory thinks that it all was completed, but iPhone loses state due to unload and does not have that pairing.

maximkulkin commented 4 years ago

Oops, wrong button, sorry.

peros550 commented 4 years ago

To add in the conversation, I can recall that during one of my tests I achieved to block a device from HomeKit. No matter what, HomeKit reported that my device was already paired although the device were not (message : Refusing to pair: already paired)

Yes, that was the original problem: if you pair and due to long duration of the process controller (iPhone) goes to sleep and unloads Home.app when acccessory was in final stage (step 3) of pairing, then accessory thinks that it all was completed, but iPhone loses state due to unload and does not have that pairing.

question is why is it taking so long to pair? I can recall pairing with Sonoff basic examples, 1 year ago, was more like instantaneous. It's quite long when I tried a Sonoff again, but nowadays pairing in my cases takes like 20-30sec.

maximkulkin commented 4 years ago

It was always ~30 seconds on ESP8266. On ESP32 it is about ~3-5 seconds.

d4rkmen commented 4 years ago

It was always ~30 seconds on ESP8266

did u ever tested mbedtls for this? interesting to compare

maximkulkin commented 4 years ago

I do not believe in fairies and besides the whole process is hard to setup and get right to do experiments for fun.

Oh yeah, also, mbedtls is just TLS implementation, it has no elliptic curves, SRP or hashes.

d4rkmen commented 4 years ago

I finally was able to pair the device successfully by >

I agree with You. The library missing some sort of pairing reset procedure. If user removes accessory offline reset will be required.

RavenSystem commented 4 years ago

Library has an API function to reset HomeKit Accessory ID: homekit_reset();

d4rkmen commented 4 years ago

Oh yeah, also, mbedtls is just TLS implementation, it has no elliptic curves, SRP or hashes.

HomeKitADK uses mbedtls, yes they implement custom SRP in the file HAPMbedTLS.c

homekit_reset();

Opps. Missed it then.

peros550 commented 4 years ago

Library has an API function to reset HomeKit Accessory ID: homekit_reset();

But if the user removes accessory offline, while accessory has no power, how the code running in the accessory could know to run the homekit_reset()? Is it possible for the accessory to be aware of such situation?

d4rkmen commented 4 years ago

how the code running in the accessory could know to run the homekit_reset()?

If user faced problems connecting the accessory, he should perform reset procedure (like long press for some button etc) and all pairings should be reset then

maximkulkin commented 4 years ago

But if the user removes accessory offline, while accessory has no power, how the code running in the accessory could know to run the homekit_reset()? Is it possible for the accessory to be aware of such situation?

If the user f*cks it up - he f*cks it up, nothing you can do about it.

peros550 commented 4 years ago

This sounds like a trap. There are users who have not a detailed knowledge of how homekit works. They might delete accessory by mistake. My 4 years son deleted an app the other day... If the accessory is i.e. a smart bulb with no physical buttons on it, there must be a way for the framework to survive these situations and reset itself.

Edit: My message above may sound harsh. Of course, I'm not in position to know if auto reset is technically possible or not, I just brought an extreme example to make my point. Of course, changing a firmware on a smart bulb has its risks on its own.

If possible, it would be best to have this function, if of course is not in contract with something else.

RavenSystem commented 4 years ago

@peros550 Firmware developer must implement methods to reset device.

For example, a Koogeek lightbulb (HomeKit certified device) has not buttons. If you remove it from HomeKit when device if power off, you have a problem... well, no. To reset lightbulb simply you must power on and off quickly several times.

Each device is different: some have buttons, some have not. Important thing is having a function to reset Homekit ID.

HomeACcessoryKid commented 4 years ago

how the code running in the accessory could know to run the homekit_reset()? Is it possible for the accessory to be aware of such situation?

Yes, because you can use wget or curl to call the url that does the reset. It was mentioned here already

d4rkmen commented 4 years ago

It was always ~30 seconds on ESP8266

did u ever tested mbedtls for this? interesting to compare

Have report for HomekitADK takes 6s for pairing on esp8266 (mbedtls powered), so, probably it worth to try it...

maximkulkin commented 4 years ago

@d4rkmen Who reports that?

maccoylton commented 4 years ago

It was always ~30 seconds on ESP8266 ...

@d4rkmen what PAL was used for that is it public ally available?

Sent with GitHawk

d4rkmen commented 4 years ago

@maccoylton update: 6s for longets request and 10s for total pairing esp8266 the repo is here https://github.com/mongoose-os-libs/homekit-adk

d4rkmen commented 4 years ago

About the issue subject: finally found the nature of this problem (it also happened to me sometimes) when it sais - pairing succesfully, but Home app still not added the accessory: this is again mDNS. The thing is, we setup mDNS immediately after sending pair-setup response. Sometimes mDNS changed even before the controller propertly proceess the response message. And it resets the context and keeps pairing again. In HomekitADK they config mDNS on the encrypted message handler, so the pairing is infact completed and we can switch mDNS to non-discoverabe. Tested this thing successfully

EuVan commented 4 years ago

Hi all. Do you have any solution or workaround for this issue? I got similar behavior with one of my devices. In device log after step 3/3 device reports mDNS activation, and after successful pairing and verification it removes pairing due to new client connection. iOS device was unlocked and Home app run during pairing. The only suspicious thing - I previously removed this device while Home app was reporting that device don’t respond. I don’t understand who triggers such behavior, because I tried to erase flash and fill new firmware, but it doesn’t help.

d4rkmen commented 4 years ago

Hi all. Do you have any solution or workaround for this issue? I got similar behavior with one of my devices. In device log after step 3/3 device reports mDNS activation, and after successful pairing and verification it removes pairing due to new client connection

This is the way it happened with me too. Try out my PR (alt mdns impl+ paired status update @ first crypted message from the controller)

maximkulkin commented 4 years ago

Are you guys building it with idf.py tool or CMake? I recently found out that CMake config is broken. Try building it with Make

EuVan commented 4 years ago

Are you guys building it with idf.py tool or CMake? I recently found out that CMake config is broken. Try building it with Make

I always build with Make but get same results. Why do you suppose the issue appears in build time?

maximkulkin commented 4 years ago

I just have a person contacting me via email with a problem that ESP32 accessory was loosing all pairings after a reset. And the reason was incorrectly set (actually, hardcoded incorrect value) HOMEKIT_SPI_FLASH_BASE_ADDR in CMake config. And the way it behaved is quite strange as it did not fail writes to incorrect address and subsequent reads worked correctly, but after reset everything was lost. So it makes me think it somehow cached in memory. But I do not have any confidence it will stay in memory indefinitely, so it might very well exhibit behavior above.

Can you check logs on which address it uses to write HomeKit data to? You can see message like

Formatting HomeKit storage at 0x200000

on a freshly erased and flashed module.

EuVan commented 4 years ago

I just have a person contacting me via email with a problem that ESP32 accessory was loosing all pairings after a reset. And the reason was incorrectly set (actually, hardcoded incorrect value) HOMEKIT_SPI_FLASH_BASE_ADDR in CMake config. And the way it behaved is quite strange as it did not fail writes to incorrect address and subsequent reads worked correctly, but after reset everything was lost. So it makes me think it somehow cached in memory. But I do not have any confidence it will stay in memory indefinitely, so it might very well exhibit behavior above.

Can you check logs on which address it uses to write HomeKit data to? You can see message like

Formatting HomeKit storage at 0x200000

on a freshly erased and flashed module.

Checked this and found following: HomeKit: Formatting HomeKit storage at 0x100000. Also I tried to fetch PR from d4arkmen and it helped. I tried to pair only once which is not representative enough but I previously spent two evening trying to solve this issue and was happy to found any solution to continue development. If I find any side effects or issues I'll give you to know. One strange thing I observed is that when I flashed another module with same firmware and turned it on, Home app could not connect to currently paired device reporting it is inaccessible. I checked accessory ID's in logs and they were different. I previously met this issue from time to time so I don't think it is related to PR I fetched, but it could be related to mDNS and root cause of the issue we discuss.