maximkulkin / esp-homekit

Apple HomeKit accessory server library for ESP-OPEN-RTOS
MIT License
1.1k stars 169 forks source link

"not responding" on some iDevices (ESP32-based) #78

Closed DietmarHoch closed 5 years ago

DietmarHoch commented 5 years ago

Hi Maxim,

based on your repo "esp-homekit-demo" i wrote an code for a garagedoor with the ESP32. please see my fork of your code. I know, this is basically not your coding style, but it works a while. (please feel free to optimize...)

The Problem occurs after a while, 10min to 1hour. I have 3 devices with the home app, after some time one device is disconnected (in the log) with "no response" in the home app. this happens to all diverse after a while with no coming back, except with a reset. maybe it is not a issue with your code, but could you take a look at my log, please? (search for "HomeKit: [Client 58] Closing client connection") Garage_terminal_Client_Disconnect.txt I don't know why the Client Numbers are that high.

hint: I work with the latest (master) of the Espressif IDF and all my iDevices are on beta (12.2)

thanks for the great work

Dietmar

maximkulkin commented 5 years ago

Can you check mDNS advertisements at the time accessories become unavailable?

peros550 commented 5 years ago

How could somebody check mDNS advertisements?

DietmarHoch commented 5 years ago

Hi Maxim, do you mean this?:

Dietmars-MBP-2:~ dietmar$ dns-sd -B _hap._tcp local Browsing for _hap._tcp.local DATE: ---Tue 05 Mar 2019--- 6:26:02.940 ...STARTING... Timestamp A/R Flags if Domain Service Type Instance Name 6:26:02.941 Add 3 5 local. _hap._tcp. VOCOlinc-LS1-0c3e37 6:26:02.941 Add 3 5 local. _hap._tcp. Garagentor-4582 6:26:02.941 Add 3 5 local. _hap._tcp. Homebridge pi3-CAD8 6:26:02.941 Add 3 5 local. _hap._tcp. Mobotix Door Jena-B6EE 6:26:02.941 Add 3 5 local. _hap._tcp. raspi cam-83BA 6:26:02.941 Add 3 5 local. _hap._tcp. Mobotix M15 Jena-D3B0 6:26:02.941 Add 2 5 local. _hap._tcp. raspi cam neu-DDD7

Dietmars-MBP-2:~ dietmar$ dns-sd -L Garagentor-4582 _hap._tcp local Lookup Garagentor-4582._hap._tcp.local DATE: ---Tue 05 Mar 2019--- 6:49:22.236 ...STARTING... 6:49:22.237 Garagentor-4582._hap._tcp.local. can be reached at Garagentor-4582.local.:5556 (interface 5) ci=4 sf=0 ff=0 s#=1 c#=1 id=45:82:20:F3:C4:9A pv=1.0 md=ESP32_GDO

DietmarHoch commented 5 years ago

Hi Maxim, it is possible to implement more debug information in the hap code? maybe some data get corrupted and Homekit block the device if it sends too much wrong data. The Information in the EVE app hints to that.

IMG_2002

if a device is offline than the error-message is different: IMG_2003

DietmarHoch commented 5 years ago

Update: the issue is also there with the LED-Example and the issue is related to my existing Homebridge server on a Raspberry with the Platform Plugin "Alexa"

if I remove the Alexa-things from the home bridge config, then the ESP32 is stable here are the lines I had to remove: "platforms": [ { "platform": "Alexa", "name": "Alexa", "username": "....", "password": "....", "pin": "111-11-333", "refresh": 15 },

don't know why the esp32 is not ignoring this (added more debug to the server.c), the authorization is different to the garage or to the LED-Example:

homekit_client_process: [Client 62] Got 165 incomming data homekit_client_process: [Client 62] DATA not crypted !!! homekit_client_process: [Client 62] incomming data: GET /accessories HTTP/1.1 Content-Type: Application/json authorization: 111-11-333 connection: keep-alive host: 192.168.178.50:5556 accept: application/json

hope this helps others... i don't need the Alexa plugin, but could you check why this happens? maybe other plugins in homebridge are affected also. here is the plugin: @northernman Alexa Skill frontend for Homebridge homebridge-alexa v0.1.72

maximkulkin commented 5 years ago

Not sure I understand what I am seeing here.

DietmarHoch commented 5 years ago

the problem is that a client (Homebridge on Raspberry Pi) connects to the ESP32 and leads to a disconnect of other existing clients by sending data with length 0. the process in the server.c "static void homekit_client_process(client_context_t *context)" destroys the connection with "context->disconnect = true;" because of the "data_len == 0" not understands why this client is not ignored, it sends a wrong authorization key, in my case 111-11-333

is it possible to implement a debug line to find out the IP and maybe more information about the client who sends the zero message? also, why is a client disconnecting with the first send of a zero message? what would happen when this message is ignored and leave the client connected?

sorry for the poor English, hope you understand

maximkulkin commented 5 years ago

So, usually when you get notified (through select()) that a socket is ready for reading (meaning there are bytes available to read) and then you read 0 bytes that means that peer has closed socket on it's end.

DietmarHoch commented 5 years ago

ok, i understand that. could you please check if the issue is on your hardware with a ESP32 Chip also? I disconnected my raspberry, but the issue is always reproduceble after a few hours. sometimes clients send 0 data, sometimes a negative value (-1) which leads to error with code 113

maximkulkin commented 5 years ago

I do not think getting 0 or -1 results from read() call is a problem. Getting 0 usually means that client on the other end has closed connection. -1 with errno == 113 is "Host unreachable" situation which usually happens when client got disconnected from network without caring to close connection to accessory. Usually happens when somebody walks away from WiFi network or just puts iPhone to sleep (locks it). Neither of the two leads to "unreachable" state.

On the other hand, sending incorrect data might cause this. So to make sure, what firmware are you using?

DietmarHoch commented 5 years ago

Hi Maxim, if you mean firmware, you asking for iOS or Mac OSX version? I am on latest beta from yesterday now on both systems, iOS 12.2 beta 6 and macOS Mojave 10.14.4 beta 6.

thanks for help so far

maximkulkin commented 5 years ago

No, I'm asking what did you flash into ESP8266.

DietmarHoch commented 5 years ago

i am using the ESP32, using the master branch of the Espressif IDF

DietmarHoch commented 5 years ago

could you please explain to me what triggers the server in the lines?:

int triggered_nfds = select(server->max_fd + 1, &read_fds, NULL, NULL, &timeout); if (triggered_nfds > 0) { if (FD_ISSET(server->listen_fd, &read_fds)) { homekit_server_accept_client(server); triggered_nfds--; }

        client_context_t *context = server->clients;
        while (context && triggered_nfds) {
            if (FD_ISSET(context->socket, &read_fds)) {
                homekit_client_process(context);
                triggered_nfds--;
            }

the "triggered_nfds" is called (true) when my esp32 fails don't know if it helps you to understand the problem

DietmarHoch commented 5 years ago

84

pbendersky commented 5 years ago

Can you check if #85 fixes it for you? It did for me.

DietmarHoch commented 5 years ago

Can you check if #85 fixes it for you? It did for me.

I did run an test over 15 hours with 2 ESP32 and 4 clients, the fix is working for me also. would be great if Maxim merge this changes into the master.

Thanks for find the answer to the issue

maximkulkin commented 5 years ago

Changes are there