espressif / esp-mqtt

ESP32 mqtt component
Apache License 2.0
610 stars 257 forks source link

Error write data or timeout (IDFGH-2514) #48

Closed ghost closed 4 years ago

ghost commented 6 years ago

Hi,

I've been using quite a while this library but after few git pull updates, it stopped working and I'm getting the follow message:

E (151464) MQTT_CLIENT: Error write data or timeout, written len = -1
E (151464) MQTT_CLIENT: Error to public data to topic=/data/0001/, qos=0
E (151474) MQTT_CLIENT: Error write data or timeout, written len = -1
E (151474) MQTT_CLIENT: Error to public data to topic=/data/0001/, qos=0
E (151494) MQTT_CLIENT: Error write data or timeout, written len = -1
E (151494) MQTT_CLIENT: Error to public data to topic=/data/0001/, qos=0

The client connects successfully to MQTT Server:

D (16414) MQTT_CLIENT: MQTT client_id=teste
D (16414) TRANS_TCP: [sock=54],connecting to server IP:192.168.1.92,Port:1883...
D (16784) MQTT_CLIENT: Transport connected to mqtt://192.168.1.92:1883
I (16794) MQTT_CLIENT: Sending MQTT CONNECT message, type: 1, id: 0000
D (16804) MQTT_CLIENT: Connected
I (16804) MY_LOGGER: MQTT_EVENT_CONNECTED

I've tried with qos=0,1 and 2 and the same error happens (output differ a little bit 'cause 0 we don't enqueue the message).

I'm using username and password, at port 1883, disable_clean_session=true.

Using mosquitto_pub I can easily publish messages:

mosquitto_pub -h <host> -u <username> -P <password> -t '/data/0001/' -m "{}"

The same credentials and configs are used for mqtt_client (esp32) and mosquitto_pub.

What can I do to try to figure out what's happening?

Thanks in advance.

klotzt commented 6 years ago

same issue for me, subscribe works but publish results in the exact same error message

btotaro-olibra commented 6 years ago

I've found messages exceeding a certain size fail. and i've found large messages that do transmit, sometimes cause the ESP32 to break wifi connectivity (dissociate, and other terrible things)

tuanpmt commented 6 years ago

@btotaro-olibra Could you please share the details?

b0m0x commented 6 years ago

This is probably related to espressif/esp-idf#1987 The proposed workaround is to set CONFIG_USE_ONLY_LWIP_SELECT.

ghost commented 6 years ago

Hi @b0m0x ,

I can confirm that setting CONFIG_USE_ONLY_LWIP_SELECT to true fixes the issue.

Follow my new output after changing that flag:


I (92095) LOG: MQTT_EVENT_CONNECTED
I (92095) wifi: ic_enable_sniffer
D (93095) MQTT_CLIENT: Sent PING successful
D (93145) MQTT_CLIENT: msg_type=13, msg_id=0
D (93145) MQTT_CLIENT: MQTT_MSG_TYPE_PINGRESP
D (96535) MQTT_CLIENT: mqtt_enqueue id: 0, type=1 successful
D (96545) MQTT_CLIENT: mqtt_enqueue id: 48938, type=3 successful
D (96545) OUTBOX: ENQUEUE msgid=48938, msg_type=3, len=133, size=133
D (96565) MQTT_CLIENT: msg_type=4, msg_id=48938
D (96565) MQTT_CLIENT: pending_id=55860, pending_msg_count = 2
D (96565) OUTBOX: DELETED msgid=48938, msg_type=3, remain size=0
D (96565) MQTT_CLIENT: received MQTT_MSG_TYPE_PUBACK, finish QoS1 publish
I (96575) LOG: MQTT_EVENT_PUBLISHED, msg_id=48938
D (96585) MQTT_CLIENT: msg_type=4, msg_id=55860
D (96585) MQTT_CLIENT: pending_id=55860, pending_msg_count = 1
D (96595) MQTT_CLIENT: received MQTT_MSG_TYPE_PUBACK, finish QoS1 publish
I (96595) LOG: MQTT_EVENT_PUBLISHED, msg_id=55860

To do the fix, go to "make menuconfig" then "Component Setting" -> "LWIP" -> "Support LWIP socket select() only". This option was recently added.

screen shot 2018-05-30 at 12 09 43
ghost commented 6 years ago

Should we close this issue or let's call it just a workaround and wait till the problem is "natively" fixed?

Thomseeen commented 5 years ago

Using the ESP-MQTT_FOR_IDF_3.1 release I still get that error on publishing data with over 16kB while having CONFIG_USE_ONLY_LWIP_SELECT set to true. Having lwip debug enabled my log looks like this when i try to publish a 17kB buffer:

tcp_write(pcb=0x3ffdf778, data=0x3ffe436c, len=5744, apiflags=3)
tcp_write: queueing 6536:7972
tcp_write: queueing 7972:9408
tcp_write: queueing 9408:10844
tcp_write: queueing 10844:12280
tcp_output_segment: 6536:7972
tcp_output_segment: 7972:9408
tcp_output_segment: 9408:10844
tcp_output_segment: 10844:12280
+-+-+-+-+-+-+-+-+-+-+-+-+-+- tcp_input: flags -+-+-+-+-+-+-+-+-+-+-+-+-+-+
tcp_process: Connection RESET
E (4691) MQTT_CLIENT: Error write data or timeout, written len = -1
E (4691) MQTT_CLIENT: Read error or end of stream
E (4701) MQTT_CLIENT: Error to public data to topic=/topic/qos0, qos=0

For some reason, looking at Wireshark, the first TCP-package gets followed by a response from my computer with a ACK and RST flag. Following that the other three packages shown above still reach my computer but get ignored.

Edit: Seems likethere is an issue with the remaining length in the messages header. Sending a message with 16kB data and a topic length of 11B causes the header to be

0x30, // Publish, no DUP, QoS 0, no Retain
0x8D, 0x7D, // remaining length of 16013B
0x00, 0x0B, // topic length 11B
topic...

which works fine and is right according to MQTT 3.1. Sending a message with 17kB data and a topic length of 11B causes the header to be

0x30, // Publish, no DUP, QoS 0, no Retain
0xF5, 0x84, // ... this is wrong. Beginning with the LSB (0x84) there is a continuation bit
               which would indicate a 3B long remaining length field. So the following 0x00 gets 
               included into the remaining length field too. As a result the topic length field ix 0x0B 
               and the first byte of the topic. That causes a wrong topic length and unreadable 
               MQTT package. The right remaining length would be 3B (0xF5, 0x84, 0x01).
0x00, 0x0B, // topic length 11B
topic...
david-cermak commented 5 years ago

Hi all,

Can we close this older issue? It looks like this was resolved in IDF, at least I cannot reproduce any connection problems related to CONFIG_USE_ONLY_LWIP_SELECT enabled or disabled

david-cermak commented 4 years ago

Closing as no recent feedback. Please reopen if needed.