technyon / nuki_hub

Use an ESP32 as a Hub between a NUKI Lock and your smarthome.
MIT License
530 stars 40 forks source link

ioBroker nuki mqtt -> Cannot parse "lock/json" #344

Closed jolichter closed 6 months ago

jolichter commented 7 months ago

Hi, it seems that there's an issue parsing the JSON string that I receiving in ioBroker. This error is hard to reproduce as it doesn't occur consistently (maybe once a day):

Client [nukihub] Cannot parse "nuki/lock/json": {"lock_state":"locked","trigger":"manual","last_lock_action":"Lock","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":"��?"}

Typically, the JSON string looks like this:

{"lock_state":"locked","trigger":"manual","last_lock_action":"Lock","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":""}

I guess, the parse error is likely due to the strange characters in the "auth_name" field? Is this known, or does anyone know a solution for it?

MQTT works with the Nuki Hub without any problems, I can see all the data and operate the lock. Therefore, I think that this error does not come from ioBroker. Only this error is irritating. I don't actually need this json data field.

technyon commented 7 months ago

Hi,

I've reviewed the code and what I think is happening that the memory region that stores the name is not initialzed on boot, so this probably happens after if your ESP reboots and then publishes the lock json. Give the binary below a try.

nuki_hub-8.34-pre-5.zip

iranl commented 7 months ago

@technyon Agree, this is known behaviour when serializing non existing/destroyed pointers using ArduinoJson

jolichter commented 7 months ago

Yes, that sounds good. With version 8.33 and the ESP32-M5Stack ATOMLite I only had this message sometimes. Now I'm testing this with an ESP32-S3 and actually get the error after every restart. Is it possible to get a nuki_hub-8.34-pre-5.zip for the S3?

iranl commented 7 months ago

esp32-s3-assets.zip

jolichter commented 7 months ago

I deleted the S3 and reflashed it, but now I can't find WiFi (192.168.4.1): test4

iranl commented 7 months ago

Did you erase? There is no need for that. You can just do ota from the web interface without erasing and losing any config.

The 192.168.4.1 ip is only available when connected to the ESPs own AP during wifi config. When not erasing the ESP will still just be on the regular WIFI with a static or DHCP ip in your normal IP range

jolichter commented 7 months ago

yes, i wanted to start all over again. i will now install pre4 again (because that worked) and if that works try to install pre5 via ota...

jolichter commented 7 months ago

@iranl: OK, I completely reflashed 8.34-Pre4 and then installed the file nuki_hub_esp32s3.bin from your last esp32-s3-assets.zip via ota. Hope this is the right way, I'm not sure because the firmware still shows 8.34-pre-4.

iranl commented 7 months ago

I'm not at my development system right now so the binary I supplied is just the automatically by Github generated binary of master after @technyon merged the fix for your issue. It is expected to still show pre4 and also does not include #332.

It is meant just to test if your issue is resolved.

jolichter commented 7 months ago

Thanks for the explanation, I will now keep an eye on it...

technyon commented 7 months ago

I only locally changed the version to -pre-5, so the binary from github will still show -pre-4

jolichter commented 7 months ago

Hi, I have just forced a RestartOnDisconnectWatchdog for a test and unfortunately the error is still there. I see the error at these 2 data points:

2024-04-28 18:54:31.061 - error: mqtt.0 (82246) Client [nukihub] Cannot parse "nuki/lock/json": {"lock_state":"locked","trigger":"manual","last_lock_action":"Unlatch","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":"��?"}
2024-04-28 18:54:31.167 - error: mqtt.0 (82246) Client [nukihub] Cannot parse "nuki/presence/devices": {"lock_state":"locked","trigger":"manual","last_lock_action":"Unlatch","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":"��?"}

Screenshot_mqtt-error

Every time I open or close the door, I now get this error: 2024-04-28 19:09:01.771 - error: mqtt.0 (82246) Client [nukihub] Cannot parse "nuki/lock/json": {"lock_state":"locked","trigger":"manual","last_lock_action":"Unlatch","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":"��?"}

System Information:

System Information
Nuki Hub version: 8.34-pre-4
run: true
confVersion: 834
deviceId: 439958162
deviceIdOp: 439958162
nukiId: ***
nukidOp: 
mqttbroker: 192.168.2.164
mqttport: 1883
mqttuser: ***
mqttpass: ***
mqttlog: false
checkupdates: false
lockena: true
lockpin: 1
mqttpath: nuki
openerena: false
openerpin: 
openercont: false
mqttoppath: 
maxkpad: 
opmaxkpad: 
mqttca: 
mqttcrt: 
mqttkey: 
hassdiscovery: 
hassConfigUrl: 
dhcpena: false
ipaddr: 192.168.2.188
ipsub: 255.255.255.0
ipgtw: 192.168.2.1
dnssrv: 
nwhw: 1
nwwififb: false
rssipb: 60
hostname: nukihub
nettmout: -1
restdisc: true
rstbcn: 60
lockStInterval: 1800
configInterval: 3600
batInterval: 1800
kpInterval: 1800
kpCntrlEnabled: true
kpInfoEnabled: false
aclLckOpn: 
tcCntrlEnabled: false
tcInfoEnabled: false
confLckBasAcl: 
confLckAdvAcl: 
confOpnBasAcl: 
confOpnAdvAcl: 
accLvl: 
regAsApp: false
nrRetry: 3
rtryDelay: 100
crdusr: ***
crdpass: ***
pubAuth: false
pubdbg: false
prdtimeout: 60
hasmac: false
macb0: 
macb1: 
macb2: 
latest: 
MQTT connected: Yes
Lock firmware version: 
Lock hardware version: 
Lock paired: No
Lock PIN set: -
Lock has door sensor: No
Lock has keypad: No
Lock ACL (Lock): Allowed
Lock ACL (Unlock): Allowed
Lock ACL (Unlatch): Allowed
Lock ACL (Lock N Go): Allowed
Lock ACL (Lock N Go Unlatch): Allowed
Lock ACL (Full Lock): Allowed
Lock ACL (Fob Action 1): Allowed
Lock ACL (Fob Action 2): Allowed
Lock ACL (Fob Action 3): Allowed
Lock config ACL (Name): Disallowed
Lock config ACL (Latitude): Disallowed
Lock config ACL (Longitude): Disallowed
Lock config ACL (Auto Unlatch): Disallowed
Lock config ACL (Pairing enabled): Allowed
Lock config ACL (Button enabled): Allowed
Lock config ACL (LED flash enabled): Allowed
Lock config ACL (LED brightness): Disallowed
Lock config ACL (Timezone offset): Disallowed
Lock config ACL (DST mode): Disallowed
Lock config ACL (Fob Action 1): Disallowed
Lock config ACL (Fob Action 2): Disallowed
Lock config ACL (Fob Action 3): Disallowed
Lock config ACL (Single Lock): Allowed
Lock config ACL (Advertising Mode): Disallowed
Lock config ACL (Timezone ID): Disallowed
Lock config ACL (Unlocked Position Offset Degrees): Disallowed
Lock config ACL (Locked Position Offset Degrees): Disallowed
Lock config ACL (Single Locked Position Offset Degrees): Disallowed
Lock config ACL (Unlocked To Locked Transition Offset Degrees): Disallowed
Lock config ACL (Lock n Go timeout): Disallowed
Lock config ACL (Single button press action): Disallowed
Lock config ACL (Double button press action): Disallowed
Lock config ACL (Detached cylinder): Disallowed
Lock config ACL (Battery type): Disallowed
Lock config ACL (Automatic battery type detection): Disallowed
Lock config ACL (Unlatch duration): Disallowed
Lock config ACL (Auto lock timeout): Disallowed
Lock config ACL (Auto unlock disabled): Allowed
Lock config ACL (Nightmode enabled): Disallowed
Lock config ACL (Nightmode start time): Disallowed
Lock config ACL (Nightmode end time): Disallowed
Lock config ACL (Nightmode auto lock enabled): Disallowed
Lock config ACL (Nightmode auto unlock disabled): Disallowed
Lock config ACL (Nightmode immediate lock on start): Disallowed
Lock config ACL (Auto lock enabled): Allowed
Lock config ACL (Immediate auto lock enabled): Disallowed
Lock config ACL (Auto update enabled): Disallowed
Network device: Built-in Wi-Fi
Uptime: 0 minutes
Heap: 154076
Stack watermarks: nw: 7732, nuki: 4012, pd: 100
Restart reason FW: RestartOnDisconnectWatchdog
Restart reason ESP: ESP_RST_SW: Software reset via esp_restart.
jolichter commented 7 months ago

PS: as mentioned above, I had completely reflashed version 8.34-pre-4 and then imported the file nuki_hub_esp32s3.bin from esp32-s3-assets.zip via ota.

iranl commented 7 months ago

I added a second fix in #345. Binary: esp32-s3-assets.zip

Note: nuki/presence/devices should never be a JSON string. I can't explain that topic having the same value as nuki/lock/json. Please clear the topic using something like MQTT explorer.

jolichter commented 7 months ago

Thanks for your efforts! I have recreated the mqtt data points in ioBroker and switched the S3 off and on again (reset due to power-on event). The error in the nuki/presence/devices data point no longer appears, but still always in nuki/lock/json:

2024-04-28 20:34:05.154 - error: mqtt.0 (102711) Client [nukihub] Cannot parse "nuki/lock/json": {"lock_state":"locked","trigger":"manual","last_lock_action":"Unlatch","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":"��?"}
2024-04-28 20:34:28.842 - error: mqtt.0 (102711) Client [nukihub] Cannot parse "nuki/lock/json": {"lock_state":"unlocked","trigger":"manual","last_lock_action":"Unlatch","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":"��?"}

Even if it doesn't solve the error, it would be enough for me if I could deactivate this json in Nuki Hub.

jolichter commented 7 months ago

Hi, I’m not so experienced in programming and usually only write smaller snippets of code. So, it’s a bit challenging for me to fully grasp this issue. It seems like there might be a problem with how authName is being initialized, which is causing odd characters in my MQTT message.

In the publishAuthorizationInfo function (Fix authName), authName appears to be set to null characters with memset. Could it be that authName isn’t always being overwritten with a valid string, which could be the reason for the strange characters? It reminds me of a similar encoding issue I had with a MySQL database as I switched my CMS from MODX EVO to MODX REVO (Unicode/UTF-8 encoding problems).

Please excuse me if this question seems a bit "naive." If it doesn’t make sense, feel free to ignore it :-)

technyon commented 7 months ago

Hi, does the problem still occur with the second fix too?

authName is initialized with zeros now. In C++, a string is just an array of characters, which gets a certain memory address allocated by the compiler. In contrast to modern programming languages, C++ doesn't initialize the memory for you, so there will be random characters if not initialized, which I fixed already.

What is writtten after that comes directly via bluetooth from the lock. It's memcpy from the response of the lock into authName, this is plain ascii, no UTF8 or anything like that. There's no other obvious place in the code that would overwrite the string - unless it's something less obvious like a buffer overrun that writes into the wrong memory address.

What is supposed to be in authName at this point? An empty string or a name?

jolichter commented 7 months ago

Hi, yes I installed the second fix esp32-s3-assets.zip.

Very rarely a JSON string is also stored in nuki/presence/devices with the same content as in nuki/lock/json. Which is of course wrong, could it possibly be an overflowed buffer? 344#issuecomment-2081559601

Yes, an empty string would be OK regarding your question 'An empty string or a name?'.

jolichter commented 7 months ago

OK, C++ is not my cup of tea. I was looking for memcpy, would strncpy be worth a try instead? Or am I on the wrong track?

Found this on stackoverflow: "The main difference is that memcpy will copy all N characters you ask for, while strncpy will copy up to the first null terminator inclusive, or N characters, whichever is fewer."

Accordingly, the strncpy function can be used to copy a maximum number of characters and ensure that the target string does not overflow. It also ensures that the target string is properly terminated.

In Gamba's Basic or PHP I would also check the length of log.name before copying. I assume that it is about these lines (from log.name in authName)? Row 388:

        if((log.loggingType == NukiLock::LoggingType::LockAction || log.loggingType == NukiLock::LoggingType::KeypadAction) && ! authFound)
        {
            authFound = true;
            authId = log.authId;
            memcpy(authName, log.name, sizeof(log.name));
        }

Could this work? Code:

if((log.loggingType == NukiLock::LoggingType::LockAction || log.loggingType == NukiLock::LoggingType::KeypadAction) && !authFound)
{
    authFound = true;
    authId = log.authId;

    // Determine the length of log.name, but not more than the size of authName - 1
    size_t nameLength = strnlen(log.name, sizeof(authName) - 1);

    // Copy the data with strncpy and make sure that no overflows occur
    strncpy(authName, log.name, nameLength);

    // Set the last character of authName to the null terminator
    authName[nameLength] = '\0';
}
technyon commented 7 months ago

Hi,

Usually one would indeed use strcpy, but in this case memcpy is better. strcpy basically copies zero-terminated strings, so it copies until it finds a 0 in memory. The Nuki API sends 32 characters, and the name can be up to 32 characters long. It's either 0 terminated, or it fills the maximum length of 32 chracters. That's why the authName character array is 33 characters long and initialized with 0s before copying. Either it copies the 0 terminated string from the Nuki API, or it copies all 32 characters, and character number 33 in the authName array is 0 and thus terminates the string. Quite a special case indeed.

jolichter commented 7 months ago

OK, pity, it was just an idea.

As I mentioned before, I don't get a 'Cannot parse lock/json ...' if the JSON auth_name was empty. Since the second fix, there is always a value in auth_name and I get the error with something like this:

{"lock_state":"locked","trigger":"manual","last_lock_action":"LockNgo","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":"`\n�?S"} or {"lock_state":"unlocked","trigger":"manual","last_lock_action":"LockNgo","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":"P�?"}

During further testing, I had this JSON once without an error message:

{"lock_state":"unlocked","trigger":"manual","last_lock_action":"LockNgo","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":"g�*Z�p������6��"}

The value auth_name was longer. Does that help?

One more question, do you also get such strange characters as auth_name?

iranl commented 7 months ago

@jolichter: Would you give this binary a try?

esp32-s3-assets.zip

Note: Use the nuki_hub.bin to do an OTA update from the Web configurator. Version should show 8.33 not 8.34-pre4

jolichter commented 7 months ago

Hi, that looks very good, so I don't get a 'cannot parse error' and auth_name remains empty. Hope you can localize the error in 8.34?

To reproduce:

  1. Firmware nuki_hub.bin loaded via ota (show Firmware 8.33)

  2. ioBroker: MQTT NUKI data points deleted and MQTT restarted, new data points created and everything looks good, no 'cannot parse error': {"lock_state":"unlocked","trigger":"manual","last_lock_action":"LockNgo","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":""}

  3. after -> Restart reason FW: RestartOnDisconnectWatchdog (Restart reason ESP: ESP_RST_SW: Software reset via esp_restart), no 'cannot parse error': {"lock_state":"unlocked","trigger":"manual","last_lock_action":"LockNgo","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":""}

  4. after -> Restart reason FW: NotApplicable (Restart reason ESP: ESP_RST_POWERON: Reset due to power-on event), no 'cannot parse error': {"lock_state":"unlocked","trigger":"manual","last_lock_action":"LockNgo","lock_completion_status":"success","door_sensor_state":"unavailable","auth_id":0,"auth_name":""}

This is just a quick test, but it looks good. I will report back in the next few days to see if it stays that way.

Thank you very much!

jolichter commented 7 months ago

@iranl: Hi, my ESP32-S3 does not report a 'cannot parse error' due to auth_name, but sometimes the ESP does not restart, in which case 2 to 3 new attempts are necessary (power off and on). Could it be because I use the nuki_hub.bin instead of nuki_hub_esp32s3.bin from the last try for testing?

Update: I have updated the file nuki_hub_esp32s3.bin from esp32-s3-assets.zip (Fix authName #345) via ota for testing, will report soon.

jolichter commented 7 months ago

Damn, had several WiFi connect problems and had to restart the ESP32-S3 again and again (power off and on).

Therefore I wanted to start again from the beginning and completely re-install the ESP. I also deleted it in order to set it up again with ESPs own AP (IP: 192.168.4.1), but without success. I can find the ESP32-S3 in the terminal via lsusb, but it does not start its AP (no WiFi).

Screenshot_ESP32_S3_Flash (I took the settings from flash.sh)

I guess that I have installed the firmware incorrectly or that the ESP is defective. If you can't see an error on the print screen, the ESP is probably broken.

Update: Could revive the ESP again with the web installer, but it crashes after every config change after save. The power supply and cable are OK, other ESPs work with them. I am now using my M5Stack ATOMLite with 8.33 again.

Maybe I'll order a new m5stack-StampS3 and test it again. This device had the best WiFi signal strength so far.

jolichter commented 7 months ago

Strong, I am facing a wifi mesh issue ;-(

Unfortunately, ESPs don't support mesh, and this m5stack-StampS3 tends to connect to the furthest repeater with the weakest signal after a restart. I've deactivated it for further testing.

Sorry, it seems to be my problem and not from Nuki-Hub and the Fix authName #345 is working so far.

jolichter commented 6 months ago

Hi, fyi: I guess the Fix authName #345 has solved the problem. I have no cannot parse "lock/json" error since 1 week. Therefore, I thank you very much and close this issue!

jolichter commented 6 months ago

The Fix authName #345 has solved the problem.