mqttjs / MQTT.js

The MQTT client for Node.js and the browser

publish disconnections #227

Closed manolodd closed 9 years ago

manolodd commented 9 years ago

Hi Matteo,

I've been running some tests using this architecture:

scenario a)

scenario b)

Results scenario a)

Results scenario b)


I also developed a Java test using Eclipse Paho for both TLS and non-TLS clients. With Eclipse Paho, everything works against the same Mosquitto broker.

I ran the same test using mosquitto_pub for both TLS and non-TLS clients. With mosquitto_pub, everything works against the same Mosquitto broker (although it uses connect-publish-disconnect).

So, since this does not seem to be a Mosquitto configuration problem, are you aware of any known problem, especially when using secureClient to publish lots of messages at a medium/high rate? It looks like a memory leak or something similar. Or maybe I'm doing something wrong, but I do not know what. Is there anything I can tune to change the behaviour of MQTT.js secureClients in relation to this problem?

Thank you so much.

mcollina commented 9 years ago

First: upgrade your Node.js. This is unrelated, but I'm definitely not supporting that version :). Secondly, and also unrelated, MQTT is a protocol optimized for low bandwidth and light usage. It is definitely not something you want to push lots of messages through.

Anyway, how many messages are you trying to send?

With both points made, you should be able to do it without too many issues. However, for your messages to be queued for delivery you need to relinquish the event loop, i.e. let it tick (that is when it performs all the I/O). A good trick is to use setImmediate.
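As a rough illustration of the setImmediate advice, the sketch below publishes in small batches and yields to the event loop between batches. The helper name publishInBatches, the batch size, and the injectable schedule parameter are assumptions made for this example, not part of MQTT.js; publish stands for any function taking (topic, payload), such as client.publish.bind(client).

```javascript
// Publish `total` messages in batches, yielding to the event loop between
// batches so pending socket I/O and keepalive traffic get a turn.
// `publish` is any (topic, payload) function; `schedule` defaults to
// setImmediate and is injectable only to keep the sketch easy to test.
// All names here are hypothetical helpers, not MQTT.js API.
function publishInBatches(publish, total, batchSize, schedule, done) {
  schedule = schedule || setImmediate;
  var sent = 0;

  function nextBatch() {
    var end = Math.min(sent + batchSize, total);
    for (; sent < end; sent++) {
      publish('/testTopic', 'Msg: ' + sent);
    }
    if (sent < total) {
      schedule(nextBatch); // let I/O callbacks run before the next batch
    } else if (done) {
      done(sent);
    }
  }

  nextBatch();
}
```

With a connected client this could be called as publishInBatches(client.publish.bind(client), 10000, 100), leaving schedule and done unset. Smaller batches yield more often at the cost of some throughput.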

Finally, there is a known issue in this library that causes that problem. I've been working on it for a while, and hopefully it will be solved in the next weeks/months. If you need it urgently, contact me directly.

manolodd commented 9 years ago

Hi Matteo, this is the code I use. It's very simple:

var mqtt = require('mqtt');

var config = {};
config.mqtt = {};

config.mqtt.options = {
  username: 'AnUser',
  password: 'APassword',
  keepalive: 20, // seconds
  clean: true,
  clientId: 'clientID'
};

var client;
client = mqtt.createClient(1883, '192.168.1.7', config.mqtt.options);
client.on('connect', onConnect);
client.on('message', onMessage);
client.on('close', onClose);
client.on('disconnect', onDisconnect);
client.on('error', onError);

var counter = 0;
var max = 10000;
var frequency = 10; // publish interval in milliseconds
var io;

function onConnect() {
  console.log('Connected');
  io = setInterval(publish, frequency);
}

function onMessage(topic, message) {
  console.log('Message received');
}

function onClose(packet) {
  console.log('Connection closed');
}

function onDisconnect(packet) {
  console.log('Disconnected');
}

function onError() {
  console.log('error!!');
}

// Called every `frequency` ms; stops the timer once `max` is reached.
function publish() {
  setImmediate(function () {
    client.publish("/testTopic", "Msg: " + counter);
  });
  if (counter >= max) {
    clearInterval(io);
  }
  counter++;
}

I've tried setting a keepalive value of 200 (instead of 20) and everything goes 'better'. It seems that, because the stream is busy publishing messages, the ping response is delayed and the keepalive timeout expires. Is that right?

If there is a bug that causes this problem and you expect to solve it in the next few months, that's OK for me. I'm not in a hurry. We still have 4-6 more months of coding ahead :-)

I know that MQTT is intended for low bandwidth and usage. But I have a three-level Mosquitto hierarchy, and the root of the hierarchy receives the aggregated messages of the rest. Behind this root Mosquitto there is a Node.js backend that sends a response to every received message. I was testing whether Mosquitto could sustain a "high" traffic rate, and the overall behaviour of the MQTT hierarchy. Mosquitto handled it, but then MQTT.js disconnection issues occurred several times. I did not expect that, as we chose Node.js because of its network performance.

Thanks for all, Matteo.

manolodd commented 9 years ago

One more thing. As I said before, MQTT.js is able to receive the same amount of aggregated messages without disconnections. Only publishing seems to have this issue.

(Publishing and subscribing are done by different mqtt.js clients on the backend)

mcollina commented 9 years ago

> I've tried setting a keepalive value of 200 (instead of 20) and everything goes 'better'. It seems that, because the stream is busy publishing messages, the ping response is delayed and the keepalive timeout expires. Is that right?

I highly recommend increasing the keepalive to 1 minute or more (the default is 60 seconds). It makes no sense to have a value that short (20 seconds). You need to take into account network latency, throughput, etc. In any case, this is unrelated to the issue I am working on, which is due to the buffers filling up.
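To put numbers on the keepalive values discussed here: MQTT 3.1.1 (section 3.1.2.10) requires the server to disconnect a client from which it receives no control packet within one and a half times the keepalive period. The sketch below computes that window and builds an options object with a longer keepalive. The helper names serverGracePeriodMs and withKeepalive are hypothetical, and the sketch does not model any client-side ping timeout that MQTT.js itself may apply.

```javascript
// MQTT 3.1.1, section 3.1.2.10: the server must drop a client that sends no
// control packet within 1.5 times the keepalive interval.
// Hypothetical helper, not part of MQTT.js.
function serverGracePeriodMs(keepaliveSeconds) {
  if (keepaliveSeconds === 0) {
    return Infinity; // a keepalive of 0 turns the mechanism off
  }
  return keepaliveSeconds * 1.5 * 1000;
}

// Return a copy of the connection options with a different keepalive,
// leaving the original object untouched. Hypothetical helper.
function withKeepalive(options, keepaliveSeconds) {
  return Object.assign({}, options, { keepalive: keepaliveSeconds });
}

var original = { username: 'AnUser', keepalive: 20, clean: true };
var relaxed = withKeepalive(original, 60);

console.log(serverGracePeriodMs(original.keepalive)); // 30000
console.log(serverGracePeriodMs(relaxed.keepalive));  // 90000
```

A longer keepalive only widens the window in which a delayed ping is tolerated; it does not address the buffering problem mentioned above.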

> I know that MQTT is intended for low bandwidth and usage. But I have a three-level Mosquitto hierarchy, and the root of the hierarchy receives the aggregated messages of the rest. Behind this root Mosquitto there is a Node.js backend that sends a response to every received message. I was testing whether Mosquitto could sustain a "high" traffic rate, and the overall behaviour of the MQTT hierarchy. Mosquitto handled it, but then MQTT.js disconnection issues occurred several times. I did not expect that, as we chose Node.js because of its network performance.

I suggest you move to a 'clustered Mosca configuration' and use the 'published' event to hook into any data processing you might want to do (https://github.com/mcollina/mosca/blob/766e1ca9edf075cfd582cc91afaccd2918d06c4d/README.md#embedded). This way, you can also get a highly available setup by scaling across multiple machines.
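A minimal sketch of what hooking into the 'published' event could look like. To stay runnable without Mosca installed, a plain Node.js EventEmitter stands in for the embedded server (created with new mosca.Server(settings) in the linked README); attachProcessing is a hypothetical helper, and skipping '$SYS' topics is an assumption about which messages the backend wants to ignore.

```javascript
var EventEmitter = require('events').EventEmitter;

// Attach a processing hook to anything that emits 'published' with
// (packet, client), the event shape shown in the Mosca README.
// Hypothetical helper, not part of Mosca.
function attachProcessing(server, handle) {
  server.on('published', function (packet, client) {
    // Assumption: broker-internal $SYS messages are not of interest.
    if (packet.topic.indexOf('$SYS') === 0) {
      return;
    }
    handle(packet.topic, String(packet.payload), client);
  });
}

// Stand-in for an embedded broker so the sketch runs on its own.
var server = new EventEmitter();
var seen = [];

attachProcessing(server, function (topic, payload) {
  seen.push(topic + '=' + payload);
});

server.emit('published', { topic: '/testTopic', payload: Buffer.from('Msg: 1') }, null);
server.emit('published', { topic: '$SYS/broker/clients', payload: 'x' }, null);

console.log(seen);
```

Processing inside the broker process removes the separate backend client, and with it the publish path that was disconnecting in this report.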

Mosca does not support the bridge protocol (https://github.com/mcollina/mosca/issues/25), but that can be added.

> I did not expect that, as we chose Node.js because of its network performance.

If you are just publishing a message every 10 ms, you might be doing something wrong somewhere else. I could easily achieve 10k+ publishes/sec on my machine. Mosquitto can handle it pretty well (also on my machine).

You might be having some issues at the kernel level; you might need to increase your network and TCP buffers.

manolodd commented 9 years ago

> I highly recommend increasing the keepalive to 1 minute or more. It makes no sense to have a value that short. You need to take into account network latency, throughput, etc.

OK, thank you. But the backend and the broker are on the same local network, which is Gigabit Ethernet with an average RTT of 0.5 ms. Throughput is far from its limit, too. However, the keepalive for remote clients is longer (two minutes).

> I suggest you move to a 'clustered Mosca configuration' and use the 'published' event to hook into any data processing you might want to do

We analyzed using Mosca for our platform, but we need QoS 2 support and Mosca only supports QoS 0 and 1. We are designing an end-to-end acknowledged protocol on top of MQTT (following a point-to-point paradigm), so we need to ensure that packets are delivered, and delivered only once.

We do not need bridging at all; we would prefer a clustering model, but we cannot give up QoS 2. It is a must.

And about the subscribing question: why does it not fail, even though it is configured with the same keepalive?

Thank you again.

mcollina commented 9 years ago

> We are designing an end-to-end acknowledged protocol on top of MQTT (following a point-to-point paradigm), so we need to ensure that packets are delivered, and delivered only once.

That doesn't really seem like a good idea. MQTT is pub/sub; if you need point-to-point, other protocols make more sense.

> And about the subscribing question: why does it not fail, even though it is configured with the same keepalive?

I have no clue. My guess is that, for some reason, subscribing leaves the event loop less busy.