Closed nonameplum closed 4 years ago
Additional info:
$ /Volumes/esp8266/esp-open-rtos/utils/addrsource 0x4021be45
0x4021be45 is in write_characteristic_json (/Volumes/esp8266/esp-homekit-demo/components/common/homekit/src/server.c:419).
414 characteristic_format_events = (1 << 4),
415 } characteristic_format_t;
416
417
418 void write_characteristic_json(json_stream *json, client_context_t *client, const homekit_characteristic_t *ch, characteristic_format_t format, const homekit_value_t *value) {
419 json_string(json, "aid"); json_uint32(json, ch->service->accessory->id);
420 json_string(json, "iid"); json_uint32(json, ch->id);
421
422 if (format & characteristic_format_type) {
423 json_string(json, "type"); json_string(json, ch->type);
Line 419 of "/Volumes/esp8266/esp-homekit-demo/components/common/homekit/src/server.c" starts at address 0x4021be42 <write_characteristic_json+30> and ends at 0x4021be51 <write_characteristic_json+45>.
$ /Volumes/esp8266/esp-open-rtos/utils/addrsource 0x4021c600
0x4021c600 is in send_client_events (/Volumes/esp8266/esp-homekit-demo/components/common/homekit/src/server.c:821).
816
817 client_event_t *e = events;
818 while (e) {
819 json_object_start(json);
820 write_characteristic_json(json, context, e->characteristic, 0, &e->value);
821 json_object_end(json);
822
823 e = e->next;
824 }
825
Line 821 of "/Volumes/esp8266/esp-homekit-demo/components/common/homekit/src/server.c" starts at address 0x4021c600 <send_client_events+116> and ends at 0x4021c605 <send_client_events+121>.
@maximkulkin Could you take a look if this might be a HomeKit bug? What I can do more to investigate the issue?
@nonameplum there's no HomeKit bug. I've build several devices and the run all perfectly. I would recommend starting with a basic setup (eg LED example) and then work your way up.
I ran many examples. Just started getting this issue on garage example from time to time. I'm not sure what might be the reason.
@nonameplum okay, so this issue happens only from time to time you say. What changes when you have this issue?
Seems like they might be some correlation of the timing as at that time few notifications goes to the HomeKit that door state has changed.
@nonameplum I saw that you already have consulted @g3ntleman in #144 I wil have a look If i can find anything I will come back to you.
@nonameplum So this is the hardware you use?
and which power supply?
Yes this is exactly my device. I use 12VDC power supply.
@nonameplum arendst/Sonoff-Tasmota#391
@AchimPieters I suppose this is not related to my problem. I don't have AC in my case. My Sonoff SV is only connected to power source 12VDC and the relay that is on the board shorts the circuit in my case to give signal to garage door opener to open/close that's it. I have stable ~3.3V on gpio.
I also pushed to my fork the changes with an additional accessory (light) that should light on/off when the door are opened/closed. Seems like with my code I get the crash way more often as it has to send three homekit notifications at reed switch interrupt.
My code is here: https://github.com/nonameplum/esp-homekit-demo/tree/my_garage/examples/garage
I also configured by @maximkulkin advice udp logging https://github.com/maximkulkin/esp-homekit/issues/88#issuecomment-520604076
But I'm able to reproduce it both with device connected via UART or standalone with 12VDC and UDP logging.
Fatal exception (9):
epc1=0x4021d32b
0x4021d32b: homekit_server_process_notifications at /Volumes/esp8266/esp-homekit-demo/components/common/homekit/src/server.c:3159
epc2=0x00000000
epc3=0x4020e63b
0x4020e63b: prvIdleTask at /Volumes/esp8266/esp-open-rtos/FreeRTOS/Source/tasks.c:4820
excvaddr=0x3fff3613
depc=0x00000000
excsave1=0x4021d326
0x4021d326: homekit_server_process_notifications at /Volumes/esp8266/esp-homekit-demo/components/common/homekit/src/server.c:3159
Registers:
a0 4021d326 a1 3fff8690 a2 3fff3468 a3 3fff3613
0x4021d326: homekit_server_process_notifications at /Volumes/esp8266/esp-homekit-demo/components/common/homekit/src/server.c:3159
a4 3fff346b a5 00000000 a6 3fff3613 a7 00000000
a8 00000003 a9 00000090 a10 00000000 a11 00000000
a12 3fff3468 a13 3fff3138 SAR 0000001a
Stack: SP=0x3fff8690
0x3fff8690: 3fff3613 3fff1f84 00000000 00000005
0x3fff86a0: ffffffff 3ffe9200 00000000 40105ee6
0x40105ee6: _free_r at /home/src/esp/newlib-cygwin/build/xtensa-lx106-elf/newlib/libc/stdlib/../../../../../newlib/libc/stdlib/nano-mallocr.c:433
0x3fff86b0: 00000001 3fff3138 3fff52f8 00000000
0x3fff86c0: 00000001 00000004 3fff2ff0 4021e8d1
0x4021e8d1: homekit_run_server at /Volumes/esp8266/esp-homekit-demo/components/common/homekit/src/server.c:3267
(inlined by) homekit_server_task at /Volumes/esp8266/esp-homekit-demo/components/common/homekit/src/server.c:3425
0x3fff86d0: b4150230 00000000 30303030 30303030
0x3fff86e0: 00000001 00000000 00000000 00000000
0x3fff86f0: 00000000 00000000 000000b6 00000000
0x3fff8700: 3fff2ff0 3fff3008 000000b6 00000000
Free Heap: 30356
_heap_start 0x3fff2520 brk 0x3fffa954 supervisor sp 0x40000000 sp-brk 22188 bytes
arena (total_size) 33844 fordblks (free_size) 8168 uordblocks (used_size) 25676
Exception 9 says unassigned address. Check the parameters you are passing too notify are correctly initialised and assigned.
It happens rarely but when current_door_state_set
is called. (modified garage example https://github.com/nonameplum/esp-homekit-demo/blob/60437fa101ca7eabd1faf6cdd29b360c599ebe3e/examples/garage/garage.c#L265)
I put this part of code for the convenience:
...
homekit_characteristic_t current_door_state = HOMEKIT_CHARACTERISTIC_(
CURRENT_DOOR_STATE, HOMEKIT_CHARACTERISTIC_CURRENT_DOOR_STATE_CLOSED,
.getter=gdo_current_door_state_get,
.setter=NULL
);
...
void gdo_current_door_state_notify_homekit() {
LOG("");
homekit_value_t new_value = HOMEKIT_UINT8(_current_door_state);
LOG("!> Notifying homekit CURRENT DOOR state: '%s' [%s]", state_description(_current_door_state), current_door_state.description);
homekit_characteristic_notify(¤t_door_state, new_value);
}
void gdo_target_door_state_notify_homekit() {
LOG("");
homekit_value_t new_value = gdo_target_door_state_get();
LOG("!> Notifying homekit TARGET DOOR state: '%s' [%s]", state_description(new_value.int_value), target_door_state.description);
homekit_characteristic_notify(&target_door_state, new_value);
}
void current_door_state_set(uint8_t new_state) {
LOG("New state: %d, old state: %d", new_state, _current_door_state);
if (_current_door_state != new_state) {
_current_door_state = new_state;
gdo_target_door_state_notify_homekit();
gdo_current_door_state_notify_homekit();
}
}
If I go to the part of the homekit's framework server.c
line 3159
as the crash log points to it refers to:
void homekit_server_process_notifications(homekit_server_t *server) {
client_context_t *context = server->clients;
while (context) {
characteristic_event_t *event = NULL;
if (xQueueReceive(context->event_queue, &event, 0)) {
// Get and coalesce all client events
client_event_t *events_head = malloc(sizeof(client_event_t));
events_head->characteristic = event->characteristic; // here the crash potentially happen based on the crash log
homekit_value_copy(&events_head->value, &event->value);
events_head->next = NULL;
homekit_value_destruct(&event->value);
free(event);
...
Seems like event
might be NULL
or corrupted? Not correctly received from the xQueueReceive
?
I'm not sure if this is because of the bug on my side or the framework has an issue.
@maximkulkin Could you take a look?
Ok, this is a rare topic, but I have already seen that before. Exception 9 is for "unALIGNED" access. That means that you're probably trying to access 32bit value at address that is not 32bit aligned.
@nonameplum to answer your question: this is probably a framework problem, not yours, but since you have a (although rare) repro, I will need your help to debug that. Can you add printf statements into homekit_server_process_notification() function inside if (xQueueReceive()) { }
block to print pointer values for event
and events_head
:
DEBUG("event = %p", event);
client_event_t *events_head = malloc(sizeof(client_event_t));
DEBUG("events_head = %p", events_head);
In meantime, I think that current garage door example is pretty messy and I had a cleaned up version laying around for quite some time, but I didn't merge it because I do not have setup to test it. If you could test it, I think it would be beneficial for everybody including you. I'm filing it in a separate branch called garage-door
No problem, I will put the logs and when I will get the crash again I will update this thread.
I will check the garage-door
too.
@maximkulkin I checked the garage-door
. There is an issue that the homekit app presents the opening/closing state without finally resolving into the open/close state. I checked it briefly. It might be that notification via homekit is sent only about ch_current_door_state
instead of current and target characteristic as before. But I might be wrong.
Beside that I didn't get the crash yet again but the reason might be that I changed the implementation of contact_sensor
to match esp-open-rtos interrupt recommended handling https://github.com/SuperHouse/esp-open-rtos/blob/master/examples/button/button.c
via QueueHandle_t
. Maybe this is the key here as both (current & target) homekit notifications were sent from the interrupt handler.
Implementation is here: https://github.com/nonameplum/esp-homekit-demo/blob/my_garage/helpers/interrupt_gpio.c#L44
Exception 9 is for "unALIGNED" access. That means that you're probably trying to access 32bit value at address that is not 32bit aligned.
IMHO this is the price for unofficial devkit. Some memory regions intended to be an 32-bit aligned instruction cache are used for heap which is not aligned. Need to test this on the latest official sdk
Actually, Im a bit dissapointed with rtos general stability. I have arduino esp projects running smooth for years nonstop. Need to try mongoose-os out.
IMHO this is the price for unofficial devkit.
But can you smell freedom and openness?! :)
Some memory regions intended to be an 32-bit aligned instruction cache are used for heap which is not aligned.
You can align stuff yourself, no need for waiting when big daddy aligns it for you. Also, I think that if you allocate something, it is already aligned. I think it is more complicated than that. Also it was seen only in particular firmware which makes me think that it is something in user code.
Actually, Im a bit dissapointed with rtos general stability. I have arduino esp projects running smooth for years nonstop. Need to try mongoose-os out.
Good luck implementing cryptography on those! I bet you do not have HomeKit accessories running on Arduino Core.
I only want to add that esp-open-rtos is VERY stable, and a single esp8266 chip using esp-homekit framework can support a dozen of different HomeKit accessories (even more) without any problem.
As we do offtopic already. Isn't for the future projects ESP-IDF (Apache 2.0 as far as I can see so license is not a problem) a better choice as is actively maintained opposed to esp-open-rtos?
Btw since I have made a change with the interrupt implementation my esp8266 is already working very stable and I didn't encounter any crash yet.
@nonameplum, @RavenSystem, any good expirience with remote access? I have posted a new Issue
@nonameplum Thanks for the update. Indeed, most button implementations execute callbacks right inside interrupt handler to avoid making it heavyweight in cases you do not need it (or need it done in a different way), so it is your responsibility to design callbacks to be lightweight and postpone all heavy lifting.
As of supporting official Espressif SDK for ESP8266: I had it done, but never pushed the change because it moved a lot of stuff in demo repository. I plan to revisit that, freshen it and push in near future.
Good luck implementing cryptography on those!
Thanks, actually trying now mongoose-os with esp32-homekit I have used this library on ARM controllers with 50+ devices, and was happy. Will see how it runs on ESP...
Thanks, actually trying now mongoose-os with esp32-homekit
Seems like project is stale
I think we can close this issue as I'm not getting the crash anymore. So I assume that the proper handling of the interrupt https://github.com/maximkulkin/esp-homekit-demo/issues/263#issuecomment-569747881 solves the crash.
Logs: