eclipse-paho / paho.mqtt.embedded-c

Paho MQTT C client library for embedded systems. Paho is an Eclipse IoT project (https://iot.eclipse.org/)
https://eclipse.org/paho
Other
1.37k stars 757 forks source link

sendPacket() sends wrong number of bytes #177

Open ghost opened 5 years ago

ghost commented 5 years ago

A very similar issue was apparently already reported back in 2015:

On 2015-02-19 18:15:55 -0500, Joe Planisky wrote:

MQTT::Client::sendPacket uses a 'while' loop to ensure all data in sendBuf is sent. However, in the call to 'ipstack.write', the length parameter is never updated to account for data already sent.

Related issue: https://github.com/eclipse/paho.mqtt.embedded-c/issues/19


After having recurring problems with the library on a newly developed STM32F215RET/WIZnet W5500 based test board, I've discovered the same issue in MQTTClient.c. - Packets were occasionally not delivered in my echo test program, when I randomly changed the send frequency of the test echo messages (especially when using QoS 2), or when I changed the MQTTYield() timeout_ms.

This seems to be a rather serious issue.

The problem is that mqttwrite() is repeatedly called to send out the same number of bytes. Instead, the already sent number of bytes should be taken into account:

https://github.com/eclipse/paho.mqtt.embedded-c/blob/develop/MQTTClient-C/src/MQTTClient.c :

static int sendPacket(MQTTClient* c, int length, Timer* timer)
{
    int rc = FAILURE,
        sent = 0;

    while (sent < length && !TimerIsExpired(timer))
    {
        rc = c->ipstack->mqttwrite(c->ipstack, &c->buf[sent], 
            length - sent, /* fix */
            /*length,*/
            TimerLeftMS(timer));
        if (rc < 0)  // there was an error writing the data
            break;
        sent += rc;
    }
    /* ... */
}

(Side Note: In my test application it was safer to check TimerIsExpired() after invoking mqttwrite() at least once, and check if the timer expired after that. This prevented unnecessary timeouts, and kept the test program happy. But this might be not the way how it is supposed to work... and this is not the actual issue here!)

Mauricio-Alencar commented 4 years ago

Good morning, I was looking for ISSUES similar to the current problem that I have and I ended up encountering here, the problem is that today I use a stm32f103rb with a wiznet5500, and I can't send more than 76 characters in a payload in the MQTTPublish function, so where can I modify the accepted shipping size of this payload?

ghost commented 4 years ago

Your issue may be related to mine, it has not been fixed yet.

Have you already tried to fix the sendPacket() function by copy-pasting the snippet above?

According to the MQTT standard, the payload size could in theory be very large (several megabytes in size), see: http://docs.oasis-open.org/mqtt/mqtt/v3.1.1/errata01/os/mqtt-v3.1.1-errata01-os-complete.html#_Toc442180836

And that shouldn't be a problem for the MQTT client library, since the actual buffer memory is external to the library itself, see: MQTTClientInit(), MQTTSerialize_publish(), MQTTSerialize_publishLength()

Have you already checked the linked porting guide, and are you sure if your read()/write() functions for the WIZnet W5500 part are OK?: http://modelbasedtesting.co.uk/2014/08/25/porting-a-paho-embedded-c-client/

When I remember correctly, I've used the linux interface code as a starting point for my STM32/WIZnet implementation, in my tests. The low-level socket code from WIZnet also came with some issues IIRC: https://github.com/eclipse/paho.mqtt.embedded-c/tree/master/MQTTClient-C/src/linux

WIZnet also provides a sample implementation here: https://github.com/Wiznet/ioLibrary_Driver/blob/master/Internet/MQTT/mqtt_interface.h https://github.com/Wiznet/ioLibrary_Driver/blob/master/Internet/MQTT/mqtt_interface.c

But the issue from this pull request is also still present in the provided Paho MQTT Client code by WIZnet as well, see: https://github.com/Wiznet/ioLibrary_Driver/blob/master/Internet/MQTT/MQTTClient.c

scaprile commented 4 years ago

Hi, to help the maintainer I would sum up by saying that: If the whole data is not sent in the first call, later calls do not account for the already sent data in prior calls. Threfore, the total amount of bytes sent in this scenario is not correct. I would also call for this issue to be changed to: sendPacket sends wrong number of bytes if first call to mqttwrite() can't send full length which seems to more clearly define the issue. Your fix looks OK to me. Regards

ghost commented 4 years ago

Hi @scaprile: Thanks for your note! I do agree that the problem is actually very simple, and its solution very clear. Sorry for cluttering this thread with too much unrelated stuff...

I'm going to change the issue title to "sendPacket() sends wrong number of bytes" just to be more precise. - This is a serious issue and I hope it will be fixed soon.

scaprile commented 4 years ago

@0rel I agree with you. Unfortunately there seems to be only one maintainer (afaik) for the whole Paho project. Hopefully other people will be able to find and apply this fix in the meantime. Can you please turn your fix in your repo to a pull request ? This way the maintainer just has to accept it. Thanks.

ghost commented 4 years ago

Thank you for the quick reply. There's the pull request with this tiny fix: https://github.com/eclipse/paho.mqtt.embedded-c/pull/178

I agree with you. Unfortunately there seems to be only one maintainer (afaik) for the whole Paho project.

Sorry to hear that. This and other small issues held me back from using the paho.mqtt.embedded-c in a ongoing embedded project I'm working on, because I was a bit afraid that the project isn't maintained anymore. A good alternative library that also works well on embedded systems is lwmqtt, which I'm using now.

Hopefully other people will be able to find and apply this fix in the meantime.

It would be great to have a well maintained alternative to libmosquitto for small systems, I hope Paho will stay alive!

gordwait commented 1 year ago

This issue is still present in the current repo (as of 2023-08-09) and is causing paho client to get refused by the mosquitto broker on MS Windows. The core problem in my case, using LWIP with this, is that the value "rc" is an error flag, not a "number of bytes written" flag.

In my tests, the connect message gets replicated several times in the connect ethernet frame, then the mosquitto broker rejects the connection due to a protocol error.

this next line returns 0 if it works, or a negative number on failure: rc = c->ipstack->mqttwrite(c->ipstack, &c->buf[sent], length - sent, / fix / /length,/ TimerLeftMS(timer)); if (rc < 0) // there was an error writing the data break; sent += rc; That previous line adds 0 to the sent value if the write was successful.

I tried a test fix, which is to actually add the amount sent to "sent" but then other issues cropped up, haven't figured out why as yet.

I wonder if the paho broker accepts multiple connect messages in a single frame, else this client would never work at all?

CIPop commented 1 year ago

using LWIP with this, is that the value "rc" is an error flag, not a "number of bytes written" flag.

That is a bug in the port of the library, not the library itself. Paho Embedded expects that the response from mqttwrite is the number of bytes actually written or negative on error (This is a very common I/O model - e.g. FreeRTOS's send).

I'm not that familiar with the repo: if the LWIP port is within the repo, please add a direct link (I couldn't find it). If the code is yours, you would need to write your own mqttwrite implementation that follows this guideline. I do agree that the project needs more documentation - I plan to add some more details, at least for the C client in the near future.

gordwait commented 1 year ago

Ah, that makes sense. I am responsible for the "shim" layer between the lwip ethernet library and MQTT, so I will edit my "mqttwrite" function accordingly. I am developing on an Intel/Altera NIOS cpu in an FPGA. Altera provided LWIP 1.3.2 as part of their "bare metal" OS. I am trying to get it to run an MQTT client.

https://savannah.nongnu.org/projects/lwip/

CIPop commented 1 year ago

I'm no LWIP expert, but would expect the Socket send() API to behave the same way as BSD Sockets (what Paho Embedded expects)

Also maybe relevant: from what I see, MQTTClient-C can only operate with blocking APIs. If your call to LWIP write is non-blocking, you would either need to:

gordwait commented 1 year ago

I'm not an LWIP expert either ;) Where do I find examples of using MQTTPacket directly? I ran into the non blocking issue already, my code is trying to connect to the broker before the TCP SYN handshake is completed, causing errors. I am trying to add MQTT to an existing design with a "bare metal" setup, that has to manually poll the ethernet to keep things moving. Not going well so far. I'm very tempted to try to run freeRTOS instead, but that is a whole other can of worms to open..

scaprile commented 1 year ago

lwIP can work in RAW mode and then it is an event-driven architecture, way far from Berkeley sockets. It can only support sockets through an OS, so if it is baremetal then it is the RAW API. Your app gets called when a frame is received, there is no buffering. Seach for NO_SYS in lwipopts.h, if it is =1, then there is no (op) sys. 1.3 is veeeeeeeeery old, around 2010 probably, and if it was "enhanced" by your vendor, well, good luck... I worked with 1.4, I think 1.3 had a TCP buffer so when you send that goes to the buffer, something is sent, and your application gets called back when the other end ACKs. Many MQTTPacket examples used to be in the samples section: https://github.com/eclipse/paho.mqtt.embedded-c/tree/master/MQTTPacket/samples

CIPop commented 1 year ago

More specifically, the non-blocking sample is here: https://github.com/eclipse/paho.mqtt.embedded-c/blob/master/MQTTPacket/samples/pub0sub1_nb.c You will need to implement a different transport layer (transport.h from the sample code) or your own. The non-blocking receive is here: https://github.com/eclipse/paho.mqtt.embedded-c/blob/master/MQTTPacket/samples/transport.h#L20

From what I see, unfortunately, transport_sendPacketBuffer is always blocking. You would need to make sure you block until data was actually sent and the buffer released for reuse.

scaprile commented 1 year ago

That one is actually my evil hand at work. I actually wrote a non-blocking function for that task. The issue was something the likes of the code assuming most data had been received, I was on a low speed network and that was not true. For the transmit part you can almost always buffer and send later, that is what I did IIRC. I was working with a cell modem and needed to send data via serial, so I buffered it and kept sending in background with yet another state machine "just put a queue there, to decouple"...