phoddie / node-red-mcu

Node-RED for microcontrollers
120 stars 18 forks source link

mqttClient error: invalid operation. Wemos D1 Mini #112

Closed colinl closed 1 year ago

colinl commented 1 year ago

I have a trivial flow consisting just of an mqtt In node feeding a debug node, running on a Wemos D1 Mini. For some particular combinations of topic and data then every time I publish the data to that topic, in xsbug I see mqttClient error: invalid operation and in the mosquitto log I see the client disconnect and reconnect again

2023-04-11 10:04:49: Client node-red-75fb29423f8a0770-1681203830270-6933 closed its connection.
2023-04-11 10:04:50: New connection from 192.168.1.245:22564 on port 1883.
2023-04-11 10:04:50: New client connected from 192.168.1.245:22564 as node-red-75fb29423f8a0770-1681203830270-6933 (p2, c1, k60000).

If I change the topic or the data then it works fine, which is very odd. So far this is the only combination of topic and data that I have found that fails: topic: "tydwr/mcu/config" and data

{
    "analogue_0": {
        "period": 100,
        "topic": "tydwr/charger/adc_mcu"
    },
    "DS18B20": {
        "globalPeriod": 10000,
        "sensors": {
            "28ff0c36511604df": "tydwr/ds18b20/hw_cyl_2",
            "282856ac0300005f": "tydwr/ds18b20/sensor1",
            "28ff550656160308": "tydwr/ds18b20/sensor2"
        }
    }
}

This is the flow in the D1 Mini

[{"id":"ab2bacdb9ebd5d92","type":"mqtt in","z":"17edb54d5e9276ec","name":"","topic":"tydwr/mcu/config","qos":"1","datatype":"json","broker":"75fb29423f8a0770","nl":false,"rap":true,"rh":0,"inputs":0,"x":200,"y":140,"wires":[["6442fc7aa866b0d9"]]},{"id":"6442fc7aa866b0d9","type":"debug","z":"17edb54d5e9276ec","name":"debug 90","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":420,"y":140,"wires":[]},{"id":"75fb29423f8a0770","type":"mqtt-broker","name":"Owl2 for mcu","broker":"192.168.1.83","port":"1883","clientid":"","autoConnect":true,"usetls":false,"protocolVersion":"4","keepalive":"60","cleansession":true,"birthTopic":"","birthQos":"0","birthPayload":"","birthMsg":{},"closeTopic":"","closeQos":"0","closePayload":"","closeMsg":{},"willTopic":"","willQos":"0","willPayload":"","willMsg":{},"userProps":"","sessionExpiry":""}]

This is the Inject node used to publish the data from the machine running the broker

[{"id":"6e26c91d7ad41ff8","type":"inject","z":"77ec6bf6.962f64","name":"mqtt test","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":false,"onceDelay":"10","topic":"tydwr/mcu/config","payload":"{\"analogue_0\":{\"period\":100,\"topic\":\"tydwr/charger/adc_mcu\"},\"DS18B20\":{\"globalPeriod\":10000,\"sensors\":{\"28ff0c36511604df\":\"tydwr/ds18b20/hw_cyl_2\",\"282856ac0300005f\":\"tydwr/ds18b20/sensor1\",\"28ff550656160308\":\"tydwr/ds18b20/sensor2\"}}}","payloadType":"json","x":140,"y":1420,"wires":[["fd9655766a56ec3e"]]}]

I am running the latest node-red-mcu pulled from github. I am not using the node red plugin for this test.

colinl commented 1 year ago

I suspect this is a buffer overflow or uninitialised variable or something similar. Trivial changes to the data can make it fail or run successfully.

phoddie commented 1 year ago

Thanks for the report. I'll try to reproduce this on my set-up. The error being reported, invalid operation, appears to be coming from the MQTT parser of received data. It indicates that the first byte of a message from the broker is an invalid MQTT opcode (either 0 or 15).

phoddie commented 1 year ago

I wasn't able to reproduce the same error. Here's what I tried:

That would work for a bit and then eventually fail with a "fragmented receive unimplemented" exception. That is a reminder to me to implement fragmented MQTT receive. So I did. That's committed. Now the flow runs solidly for me on ESP8266 (several minutes, no exceptions, messages continuously received).

I don't think this is the same error you saw, as it would have cause the ESP8266 to reboot instead of reporting "mqttClient error: invalid operation.

I think you are running a different MQTT Broker, probably something local. Should I try to match that? Other ideas for how I might be able to reproduce this?

colinl commented 1 year ago

No, that isn't the same problem. For me the flow fails immediately on the first message received. I am using mosquitto locally. I will see if I can make it fail with the public test broker.

colinl commented 1 year ago

Ok, I have it failing with test.mosquitto.org. I pulled the latest versions of node-red-mcu and Moddable, and rebuild the moddable tools. I have this flow in node-red

[{"id":"b1d22625332df051","type":"mqtt in","z":"17edb54d5e9276ec","name":"","topic":"tydwr/mcu/config","qos":"1","datatype":"json","broker":"450307207f1c7206","nl":false,"rap":true,"rh":0,"inputs":0,"x":120,"y":120,"wires":[["8e4a85438c2e6f14"]]},{"id":"8e4a85438c2e6f14","type":"debug","z":"17edb54d5e9276ec","name":"debug 87","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":280,"y":120,"wires":[]},{"id":"450307207f1c7206","type":"mqtt-broker","name":"test.mosquitto.org","broker":"test.mosquitto.org","port":"1883","clientid":"","autoConnect":true,"usetls":false,"protocolVersion":"4","keepalive":"60","cleansession":true,"birthTopic":"","birthQos":"0","birthPayload":"","birthMsg":{},"closeTopic":"","closeQos":"0","closePayload":"","closeMsg":{},"willTopic":"","willQos":"0","willPayload":"","willMsg":{},"userProps":"","sessionExpiry":""}]

I build and run it on a Wemos D1 Mini using mcconfig -d -m -p esp/nodemcu ssid=... Then in my local machine I run

[{"id":"775b661a50458a75","type":"inject","z":"77ec6bf6.962f64","name":"mcu config","props":[{"p":"payload"},{"p":"topic","vt":"str"}],"repeat":"","crontab":"","once":false,"onceDelay":"10","topic":"tydwr/mcu/config","payload":"{\"analogue_0\":{\"period\":100,\"topic\":\"tydwr/charger/adc_mcu\"},\"DS18B20\":{\"globalPeriod\":10000,\"sensors\":{\"28ff0c36511604df\":\"tydwr/ds18b20/hw_cyl_2\",\"282856ac0300005f\":\"tydwr/ds18b20/sensor1\",\"28ff550656160308\":\"tydwr/ds18b20/sensor2\"}}}","payloadType":"json","x":140,"y":1540,"wires":[["b9f4ae1e058e93c8"]]},{"id":"b9f4ae1e058e93c8","type":"mqtt out","z":"77ec6bf6.962f64","name":"test.mosquitto.org","topic":"","qos":"1","retain":"false","respTopic":"","contentType":"","userProps":"","correl":"","expiry":"","broker":"f4ab502aee074e99","x":410,"y":1540,"wires":[]},{"id":"028062fe6b495324","type":"mqtt in","z":"77ec6bf6.962f64","name":"tydwr/mcu/config","topic":"tydwr/mcu/config","qos":"1","datatype":"auto","broker":"f4ab502aee074e99","nl":false,"rap":false,"inputs":0,"x":160,"y":1600,"wires":[["ff139ea65bddeba1"]]},{"id":"ff139ea65bddeba1","type":"debug","z":"77ec6bf6.962f64","name":"debug 19","active":true,"tosidebar":true,"console":false,"tostatus":false,"complete":"false","statusVal":"","statusType":"auto","x":400,"y":1600,"wires":[]},{"id":"f4ab502aee074e99","type":"mqtt-broker","name":"test.mosquitto.org","broker":"test.mosquitto.org","port":"1883","clientid":"","autoConnect":true,"usetls":false,"protocolVersion":"4","keepalive":"60","cleansession":true,"birthTopic":"","birthQos":"0","birthPayload":"","birthMsg":{},"closeTopic":"","closeQos":"0","closePayload":"","closeMsg":{},"willTopic":"","willQos":"0","willPayload":"","willMsg":{},"userProps":"","sessionExpiry":""}]

and as soon as I click the Inject node, xsbug shows mqttClient error: invalid operation.

phoddie commented 1 year ago

@colinl – with your updated instructions, I can reliably reproduce the problem! It happens in the simulator as well as the ESP8266, so it isn't an MCU induced issue. I'll try to fix this later today.

phoddie commented 1 year ago

There was a bug in the parser in the short-circuit for 0 length payloads. Your test payload has a length of 256 bytes and the length of 0 in the first byte erroneously triggered the short-circuit. The mistake is here. The fix will be live tomorrow. If you want to verify it before then, here's the updated version:

                // remaining length
                case 1:
                    byte = socket.read();
                    parse.length = byte & 0x7F;
                    if (byte & 0x80)
                        parse.state = 2
                    else {
                        parse.state = parse.operation << 4;
                        if (!parse.length) {        // no payload - handle immediately (PINGREQ, PINGRESP, DISCONNECT)
                            if (this.#parsed(parse))
                                return;
                            parse = {state: 0};
                        } 
                    }
                    break;

I've confirmed this works on both the simulator and ESP8266.

Thanks again for taking the time to isolate this to a simple test case. That made fixing this very straightforward.

colinl commented 1 year ago

Your test payload has a length of 256 bytes

I seem to have a knack of hitting issues like that.

colinl commented 1 year ago

Excellent, I applied the patch and it is working well.

phoddie commented 1 year ago

Woohoo!

colinl commented 1 year ago

Have you had a chance to commit this? I am not seeing it yet.

phoddie commented 1 year ago

Thanks for the reminder. Friday got busy. We posted the update earlier today.

colinl commented 1 year ago

Latest source from repo tested. Closing this as fixed.

phoddie commented 1 year ago

Excellent, thank you.