LabVIEW-Open-Source / MQTT-Client

A LabVIEW-based client for MQTT

Memory leak in SendPacket.vi #4

Closed kjkirkegaard closed 3 years ago

kjkirkegaard commented 3 years ago

Hi

In the search for an explanation of why it takes an average of 1.4 ms to publish to a topic, I think I found a memory leak in SendPacket.vi. We only use QoS = 0 in this application because we need speed and we can survive losing packets. To my knowledge, when a packet is sent with QoS = 0 the MQTT broker will not reply with an ACK, and this is where the problem starts: every time we send a packet, the PacketID is stored in the PacketIDs DVR, but it is never taken off the PacketIDs array because no ACK is returned to the MQTT client, and therefore the array keeps growing.

One solution could be to use the Wait for ack? input to skip saving the PacketID when we don't need to wait for an ACK. But I'm not 100% sure that the PacketID is only used for verifying the ACK; if it is used for something else, this is not the correct solution.
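A rough pseudo-Python sketch of the behaviour being described and the proposed change (the names `pending_packet_ids`, `wait_for_ack` and `transmit` are illustrative placeholders, not the toolkit's actual identifiers):

```python
# Illustrative sketch only -- the real logic lives in LabVIEW (SendPacket.vi);
# pending_packet_ids stands in for the PacketIDs DVR.

pending_packet_ids = set()

def transmit(packet_id, qos):
    """Placeholder for the actual TCP write of the MQTT packet."""
    pass

def send_packet_current(packet_id, qos):
    # Reported behaviour: the ID is stored unconditionally. With QoS = 0 the
    # broker never replies with an ACK, so the ID is never removed and the
    # collection grows on every publish (the leak).
    pending_packet_ids.add(packet_id)
    transmit(packet_id, qos)

def send_packet_proposed(packet_id, qos, wait_for_ack):
    # Proposed fix: only track the ID when an ACK is actually expected
    # (wait_for_ack would be False for QoS = 0).
    if wait_for_ack:
        pending_packet_ids.add(packet_id)
    transmit(packet_id, qos)

def on_ack(packet_id):
    # For QoS > 0, the ACK handler is where the ID is released again.
    pending_packet_ids.discard(packet_id)
```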

But I did not find an explanation for why it takes an average of 1.4 ms to publish to a topic.

Regards, Kaare Juul Kirkegaard

francois-normandin commented 3 years ago

That's a good observation. This was put in place to handle QoS, and I agree with you that it should be skipped for QoS = 0. As for the 1.4 ms latency, can you open another issue describing how you decoupled the part of the latency due to the connection from the internal packet handling? I'd like to create an official benchmark for the toolkit, so we can leverage the unit test framework to assess this important parameter.

kjkirkegaard commented 3 years ago

Yes, I will open a new issue and explain how I measured the publish time.

francois-normandin commented 3 years ago

Memory Leak

The memory leak will be fixed in the upcoming release. @kjkirkegaard Thanks very much for the report.

Latency

I just tested the following pathway, which is to publish a raw payload with QoS = 0 (diagram taken from https://github.com/LabVIEW-Open-Source/LV-MQTT-Broker/wiki/Publish-message-with-QoS-=-1).

The payload used was 4 bytes, for a packet size of 15 bytes. To decouple the effect of a server, I ran a parallel loop with a bare TCP Read, completely ignoring the time to unpack the 15 bytes and the processing to extract the topic name. This setup also isolated the effect of the broker receiving and relaying the incoming bytes to the session/subscription engine.

With this simple setup, the client was dispatching the Publish packet to a session loop in ~48 µs. I did not try to measure the session process alone, as it runs in a parallel thread, but considering that the TCP read is probably negligible, the second leg (the session handling the bytes, sending by TCP and receiving in a second loop) ran in about 35 µs when my processor was not doing something else in the background. (Each point is the result of averaging the latency measured over 1000 packets, or 15,000 bytes.) Doing it on 10,000 packets didn't change the result significantly.
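For reference, the shape of that measurement can be sketched in Python with plain sockets (an analogy to the LabVIEW test VI, not its code; the host, port, packet size and iteration count are illustrative):

```python
# Sketch of the benchmark idea: time how long it takes to hand a small packet
# to a local TCP endpoint, with a reader thread that discards the bytes so no
# broker-side processing is included in the measurement.
import socket
import threading
import time

HOST, PORT = "127.0.0.1", 1884   # arbitrary local port for the test
PACKET = bytes(15)               # 15-byte packet, as in the test above
N = 1000                         # average over 1000 packets

def reader(conn):
    # Parallel loop with a bare TCP read; the bytes are ignored on purpose.
    while conn.recv(4096):
        pass

server = socket.create_server((HOST, PORT))
client = socket.create_connection((HOST, PORT))
conn, _ = server.accept()
threading.Thread(target=reader, args=(conn,), daemon=True).start()

start = time.perf_counter()
for _ in range(N):
    client.sendall(PACKET)
elapsed = time.perf_counter() - start
print(f"average send latency: {elapsed / N * 1e6:.1f} us")
```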

This is by no means a perfect test, because I am not knowledgeable about the inner workings of a local TCP connection (does it implement the Nagle algorithm, for example?), but the point of the test was that I didn't want to measure the connection latency. It is still much faster than the 1.4 ms being reported. Of course, I used a small payload... Could it be that the serialization is very inefficient? I didn't use any serialization in my test.
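As an aside on the Nagle question: in a text-based harness one can rule out Nagle buffering explicitly with the standard TCP_NODELAY socket option (shown below in Python purely for illustration; whether LabVIEW's TCP primitives set it is exactly the open question above).

```python
import socket

def disable_nagle(sock: socket.socket) -> None:
    # Standard socket option to send small packets immediately instead of
    # letting the OS coalesce them (Nagle's algorithm). Illustrative only --
    # this says nothing about what LabVIEW's TCP Write actually does.
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)
```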


That does not mean there are no efficiency gains to be found! And I'd be grateful for any benchmark suggestion that I can add to the test suite...

I'll be adding the VI to the repo. It's called "Test Publish QoS 0 Latency.vi".

francois-normandin commented 3 years ago

Fixed and released (3.1.5). Coming soon on VIPM Community.

kjkirkegaard commented 3 years ago

Thank you for the update.

Regarding the latency, I would just add that I also measured the publish time, and when I run my test on my PC I get comparable numbers. But I need this to run on LabVIEW RT (PharLap), and there the same measurement shows an average of 1.4 ms. The target is a PXI controller with 8 CPU cores.

francois-normandin commented 3 years ago

Benchmarking can be followed in the new topic #7.