Closed canatella closed 1 year ago
@canatella, thanks for the work. Can you explain why do you think that the handler/status_update pointer becomes invalid?
Honestly, I just got a clue by the fact that the stack trace is pointing at https://github.com/uriyacovy/NukiBleEsp32/blob/1de629002a03e6fffc9acd8b746baf498b50f77b/src/NukiBle.cpp#L244 where there is nothing going on but two pointer access:
Now looking at the doc: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-guides/fatal-errors.html#loadprohibited-storeprohibited: LoadProhibited, StoreProhibited
These CPU exceptions happen when an application attempts to read from or write to an invalid memory location. The address which has been written/read is found in the EXCVADDR register in the register dump. If this address is zero, it usually means that the application has attempted to dereference a NULL pointer. If this address is close to zero, it usually means that the application has attempted to access a member of a structure, but the pointer to the structure is NULL. If this address is something else (garbage value, not in 0x3fxxxxxx - 0x6xxxxxxx range), it likely means that the pointer used to access the data is either not initialized or has been corrupted.
According to the stack trace: EXCVADDR: 0x000000ff
so it looks like we are dereferencing a pointer in a NULL struct, the offset is 255 bytes which is quite large and also looks like a strange value to me, but that doesn't allow differentiating between the two dereferences without looking at the struct generated by the compiler (which is possible but maybe not worth it here).
Thanks for the analysis. Yet, it does not point to which pointer was invalid. If this issue reproduces easily (in my setup it does not occur), we can try to isolate and maybe understand better. I think that the first approach would be to define the just the handler as a member instead of a pointer, and see if it solves the issue.
Is there any specific reason why you want to use pointers there? It's much more safe to use RAII and entirely avoid this issue and possibly others. If there is undefined behavior here for some reason, there is no way to say what else could go wrong. Also, by avoiding using new, you also prevent memory allocation that could fail and the compiler will leave room in the image for the NukiLock struct so that it wouldn't flash if there wasn't enough memory for it.
The idea is to be able to build NukiLock in runtime, with relevant configuration (device name and ID) and maybe support multiple locks. This is currently not implemented, but this is the reason why I suggested to start with the handler and see if it solves the issue.
I don't think you would need pointers for that, you could just use a vector:
std::vector<NukiLockComponent> locks;
And with the code in the PR it would still create the NukiLock and register the handler.
@canatella, I've merged your PR to dev branch. Thanks for the contribution!
No prob, thanks for your work :)
I managed to trace what I think is issue #26 :
Looking at the code, I thought this was probably due to one of the pointers (the handler, or the
status_update_
pointer) getting invalid at some point and the crash leaving the bluetooth stack in an inoperable state.This code replaces the lock instance and the handler with member variables and configures event handling in the constructor so that no pointers are needed anymore (and so none can get invalid).
This seems to fix issue #26 for me but I still haven't used it a lot. But the crash that I dumped here is fixed for sure.