maximkulkin / esp-homekit

Apple HomeKit accessory server library for ESP-OPEN-RTOS
MIT License
1.11k stars 170 forks source link

Automations become unreliable with lots of controllers #18

Closed renssies closed 6 years ago

renssies commented 6 years ago

I will start off by saying this is an edge case, but automations and notifications become unreliable if you have a lot of devices connected to HomeKit. I've just seen this happen at our office.

What happened:

I think this issue exists because esp-homekit currently only supports a max of 8 sessions while the specs call for a minimum of 16 sessions. I think that HomeKit has a mechanism that activates once 16 sessions has been reached, probably routing new sessions through the home hub or other clients. But since we never even allow 16 sessions, this mechanism never starts working.

Edit Jan 16 2018: Updated wording of pairing vs sessions

maximkulkin commented 6 years ago

So, here is a problem: ESP8266 has a very limited amount of RAM. We try our best to use minimal amount of ram, in small chunks (to avoid allocation failures due to fragmentation), one client at a time (so multiple clients do not consume memory). Right now the problem is that after you do send() over a socket, the payload is not sent immediately, but rather copied into LWIP output queue, so after you have processed another client the memory usage does not return to same level as when you have started processing that client. That memory will be eventually freed once information is sent. If you process clients fast enough and send large/many chunks of data (as happens when multiple clients re-establish sessions and re-request /accessories) you might run out of RAM.

We need to find a way to wait until socket send queue will be emptied while processing each client. That way it will be a bit slower but will avoid depleting RAM.

gongloo commented 6 years ago

I suspect that we can do just fine bumping HOMEKIT_MAX_CLIENTS to 16, especially now that we do blocking socket writes (avoiding lots of additional socket send buffering). I've tried it locally and haven't seen any issues on a Sonoff Basic (1MB) yet. Is there any reason at this point not to do that?

I'm not sure how large client_context_t is, but I suspect that we can handle 16 of them now. Might be worth preallocating and free-listing them while we're at it, but maybe that's also overkill. Thoughts?

maximkulkin commented 6 years ago

Yeah, you can definitely do that. Keep in mind that each client consumes ~1100-1200 bytes of RAM (mostly because we need a 1038 bytes of receive buffer to be able to receive and decrypt larges encrypted chunk possible, plus some housekeeping).

I will wrap that setting with #ifndef/#endif to allow changing it from build script without changing source code.

gongloo commented 6 years ago

Understood, but it also sounds like setting it to anything <16 is a breakage spec-wise. ~20K of RAM likely won't be an issue for most builds, so I kindly suggest bumping the default to 16. Happy to submit a trivial PR to this effect.

maximkulkin commented 6 years ago

Ok, I pushed a patch to change HOMEKIT_MAX_CLIENTS to 16 by default and also make it configurable through environment/Make variables.

maximkulkin commented 6 years ago

@renssies Are we good here?

renssies commented 6 years ago

@maximkulkin I will run a test again today. If it's good by the end of the day I will close the issue :)

renssies commented 6 years ago

My contact sensor example has been running fine for six hours in an environment with 22 HomeKit Controllers/Clients. Closing this. Before it would become unreliable after just an hour.