moscajs / mosca

MQTT broker as a module

Extended Mosca Test Suite - 8700% slower than Mosquitto? #385

Closed: cefn closed this issue 8 years ago

cefn commented 8 years ago

Hi again. Thanks for bearing with me as I work through these performance issues.

I'm trying to better recreate, in the test suite, the performance issue we see from Mosca in our production system. The new test I've added makes individual subscriptions to topics. You'll notice I've added a separate process for Mosca to give it a fair run (and I'm running tests with debugging disabled).

Unfortunately this test demonstrates a huge slowdown from multiple individual topic subscriptions (compared to wildcard topics), even on a Mac OS network stack: not quite 100 times slower than Mosquitto. I'm running with memory persistence as I imagine this would be fastest.

Running with 1000 topics (each with a single retained message, and with max inflight set to 2048 for safety), Mosca takes a total of 123 seconds to receive 1000 retained messages, handle 1000 subscriptions and deliver the messages back against the individual subscriptions, based on the configuration at https://github.com/cefn/stressMQTT/

Exactly the same test running in Mosquitto takes just 1.4 seconds to complete! Is this really down to unoptimised codepaths, even after 1000 messages have passed through? Maybe there's something else I've missed.
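For concreteness, the shape of the test is roughly the following (a minimal sketch using mqtt.js; the broker URL, topic names and QoS settings are illustrative rather than the exact stressMQTT code):

var mqtt = require('mqtt');

var TOPICS = 1000;
var client = mqtt.connect('mqtt://localhost:1883');

client.on('connect', function () {
  var pending = TOPICS;
  // Seed one retained message per topic, then subscribe once all publishes
  // have been acknowledged (QoS 1 gives us a callback to count against).
  for (var i = 0; i < TOPICS; i++) {
    client.publish('bench/topic' + i, 'value' + i, { retain: true, qos: 1 }, function () {
      if (--pending === 0) subscribeAll();
    });
  }
});

function subscribeAll() {
  var start = Date.now();
  var received = 0;
  client.on('message', function () {
    if (++received === TOPICS) {
      console.log('retained round trip: ' + (Date.now() - start) + 'ms');
      client.end();
    }
  });
  // Individual subscriptions, no wildcards: this is the slow path.
  for (var i = 0; i < TOPICS; i++) {
    client.subscribe('bench/topic' + i, { qos: 1 });
  }
}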

Sorry not to have updated the log files in the repo yet. Must automate that process at some point. However, I also want to get feedback on what optimisations I'm not yet doing before I update the records.

mcollina commented 8 years ago

You should update Aedes, the bug you reported should be fixed now. If not, reopen.

I'm sorry to say that some of the slowdowns are unsolvable here, because I took a wrong architecture years back. As I said elsewhere, I will not fix this in the 0.x.x line; again, check Aedes.

As a general rule, subscribing is slowish here. In your setup, looking up retained messages is extremely slow because memory storage is implemented as a 'quick hack' here. In Aedes it is done more properly, but it still uses basic JS objects for storing things.

Use LevelDB persistence for Mosca and the Mosquitto db for a fair comparison. Running retained messages without storing to disk is almost useless anyway. Also, disable logs in both.

This particular scenario is definitely out of scope for my goals. Subscribing to 10000 topics is not a normal use case for an MQTT client. Usually a device subscribes to 5-10 topics.

cefn commented 8 years ago

Thanks for the guidance. Very helpful. Is there a recommended persistence for Aedes we should be testing for a fair comparison?

I hope to persuade you that the unique capabilities of MQTT, combining a key-value store with pub-sub (especially given the excellent protocol design decisions on QoS and persistent sessions), offer a distinctive and performant model, adaptable even to cases where large numbers of keys are being monitored.

There really isn't anything like MQTT, and if it's feasible to implement it in ways which don't penalise large numbers of topics, that would be a huge bonus and would make the ecosystem able to serve a really diverse set of applications.

However, I also appreciate that open source projects are motivated by the scenarios where people can scratch their own itch, so if I have to jump through hoops to get the performance I want, that's my problem. If Mosca and Aedes have to make tradeoffs to serve their domain of focus, I totally accept I may have to do work to specialise them for my own domain. It's very useful to have your help so I can understand where I may need to intervene.

While many 'terminal' IoT clients are destined to have very few key pairs, I'm not sure that's true of clients doing aggregation or control. That will be a function of the total number of clients, which may be very high.

However, our scenario is quite different from the IoT case. We are using MQTT to dispatch events about updates to the data structure of a shared JSON tree, on which many hundreds of browser-based WebSocket clients each have partial (and uniquely partitioned) views driven by their own logic.

We expect these to be co-located as part of events, but they may in practice be attached over transient connectivity (cell networks in event venues).

Probably the JSON tree maintained by any one client will contain a maximum of a few hundred values, but the tree as a whole could represent a few thousand or more.

Using MQTT in the backend means that client code can interactively register an interest in a subtree or leaf, have all their subscriptions routed over a single websocket, and wire updates for their few hundred keys directly into the UI. Network dropouts can be detected and recovered. We can also easily create our own strategy for versioned updates to a document store without deep integration with any broker implementation (this is managed by an MQTT client with a global subscription).
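In code terms the client side looks something like this (a minimal sketch using mqtt.js over WebSockets; the broker URL and topic scheme are illustrative, not our production code):

var mqtt = require('mqtt');

var client = mqtt.connect('ws://broker.example.com:9001');
var localTree = {};

client.on('connect', function () {
  client.subscribe('tree/venue/stage/title'); // a single leaf
  client.subscribe('tree/venue/lighting/#');  // a whole subtree
});

client.on('message', function (topic, payload) {
  // Wire each update straight into the local copy of the shared tree,
  // e.g. tree/venue/lighting/level -> localTree.venue.lighting.level
  var path = topic.split('/').slice(1);
  var node = localTree;
  for (var i = 0; i < path.length - 1; i++) {
    node = node[path[i]] = node[path[i]] || {};
  }
  node[path[path.length - 1]] = payload.toString();
});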

Any change propagating from any other client in the system arrives with the minimum of overhead and on a 'push' basis, allowing the UI to be refreshed in real time as a 'binding' against the shared tree as changes are made by UI events throughout the population of clients.

I've been through two or three redesigns now, meaning the client-side code is lightning fast and has really satisfying metaphors for developers building on the framework; the potential for scaling up deployment is therefore putting pressure on the MQTT broker.

Spinning up Mosquitto instances programmatically from Node is feasible, but not ideal given that the whole of the rest of the stack (server right through to client) is JavaScript on V8. Also, there seems to be an inherent polling issue in Mosquitto, meaning we may struggle to host large numbers of responsive brokers (the more responsive you make them by increasing polling speed, the more CPU they use).

For that reason, even though we're now getting the responsiveness we need from Mosquitto, I'll aim to devote at least a background activity to proving Aedes for our scenario, as the Node integration would be a huge bonus. In particular, the authentication callbacks would provide a really sane security layer for some kind of access control lists over different parts of the JSON tree if we go live as a public-facing service. I look forward to chatting about Aedes issues from here on in.

mcollina commented 8 years ago

When I say I am not optimizing it, I mean that I have no time to cover such a use case. However, I will accept contributions and optimizations that make other use cases feasible, provided they do not degrade the primary use case.

While many 'terminal' IoT clients are destined to have very few key pairs, I'm not sure that's true of clients doing aggregation or control. That will be a function of the total number of clients, which may be very high.

Doing this over thousands of subscriptions is not very scalable. The ideal way is to plug into the broker, intercept the messages there, and do your aggregation/control there too. Done that way, it would be scalable and resilient.
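For example, in Mosca every message crossing the broker can be intercepted from the 'published' event (a minimal sketch; the aggregation function is a placeholder):

var mosca = require('mosca');

var server = new mosca.Server({ port: 1883 });

// Every message flowing through the broker passes here, so an aggregator
// plugged in at this level needs no MQTT subscriptions at all.
server.on('published', function (packet, client) {
  aggregate(packet.topic, packet.payload);
});

function aggregate(topic, payload) {
  // placeholder for application-specific aggregation/control logic
}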

cefn commented 8 years ago

I wanted to verify we were giving Mosca the best chance by running it with levelup as you indicated. Can you guide me to a suitable Mosca configuration which uses levelup, if it's not already met by the server config in our stressMQTT test suite?

I notice that https://github.com/mcollina/mosca/blob/2694214e230a6b1febfd40fa55c7d33d40876fb0/lib/persistence/memory.js offers a suitable RAM-hosted levelup backing DB and seems to be properly referenced by our test Mosca config at https://github.com/cefn/stressMQTT/blob/master/servers/moscaServer.js

If our config doesn't already do the job, it looks like the following, from https://www.npmjs.com/package/levelup#intro, would give us a raw DB matching our needs, but I can't find a levelup example config for Mosca in the docs:

var levelup = require('levelup')
var memdown = require('memdown')
var db = levelup({ db: memdown })
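Going by the persistence section of the Mosca README, I would guess at a configuration along these lines (the path is illustrative and I haven't verified this is the recommended setup):

var mosca = require('mosca');

// Guessed LevelUp-backed configuration, based on the persistence factory
// option documented for Mosca; the path is illustrative.
var settings = {
  port: 1883,
  persistence: {
    factory: mosca.persistence.LevelUp,
    path: './mosca-db'
  }
};

var server = new mosca.Server(settings);
server.on('ready', function () {
  console.log('Mosca up with LevelUp persistence');
});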

I am also happy to consider any of the other options (Redis, ZeroMQ etc.) if they might hand off any of the slow operations to another framework. However, I think I understand from your statement...

"I'm sorry to say, some of the slowdowns are unsolvable here, because I took a wrong architecture years back."

...that the fundamentally slow part is inherent to the Mosca side, so no amount of backend configuration is going to help. If that's right, I'll abandon any attempts at backend tuning of Mosca and focus entirely on Aedes.

cordovapolymer commented 7 years ago

@cefn Did you find out how to configure a high-performance Mosca setup?

cefn commented 7 years ago

No, @cordovapolymer. I followed @mcollina's advice that Aedes was the focus of performance work and future development, but at the time there were a few problems with Aedes which were terminal for our application.

I therefore employed a Node-scripted, Dockerized and lightly patched Mosquitto configured with WebSocket support, plus a Node http-proxy front end to route and intercept the WebSocket messages to the containerized Mosquitto instances.
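The front end amounts to something like the following (a simplified sketch using node-http-proxy; the addresses and single-target routing are illustrative, as our real proxy routes and intercepts per client):

var http = require('http');
var httpProxy = require('http-proxy');

// Route WebSocket upgrades through to the containerized Mosquitto's
// websockets listener.
var proxy = httpProxy.createProxyServer({
  target: 'ws://127.0.0.1:9001',
  ws: true
});

var server = http.createServer(function (req, res) {
  res.writeHead(404);
  res.end();
});

// The upgrade handler is the point where WebSocket MQTT traffic can be
// inspected and routed before it reaches Mosquitto.
server.on('upgrade', function (req, socket, head) {
  proxy.ws(req, socket, head);
});

server.listen(8080);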

Separately, I have a daemon process monitoring and archiving topic data so that the topic tree of the Dockerized Mosquitto can be seamlessly redeployed on planned or unplanned broker restart. Although it's tempting to use native Mosquitto persistence, I wasn't keen to rely upon it.
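The daemon itself is essentially a global-subscription MQTT client, sketched below with mqtt.js; an in-memory map stands in for the real archive store:

var mqtt = require('mqtt');

// Snapshot the latest payload per topic via a global subscription; after a
// broker redeploy, replay() restores the topic tree as retained publishes.
var archive = {};
var client = mqtt.connect('mqtt://127.0.0.1:1883');

client.on('connect', function () {
  client.subscribe('#'); // matches everything except $SYS topics
});

client.on('message', function (topic, payload) {
  archive[topic] = payload;
});

// Call with a client connected to the freshly restarted broker.
function replay(newClient) {
  Object.keys(archive).forEach(function (topic) {
    newClient.publish(topic, archive[topic], { retain: true, qos: 1 });
  });
}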

At some point I would look at Aedes again, as Mosquitto is not architecturally woven into our solution (it presents itself simply as a WebSocket-based MQTT broker to the clients, the http-proxy layer and the archiving daemon). At present Mosquitto is serving its purpose, though.

rokka-n commented 7 years ago

@cefn Did you consider sending messages to Kafka for persistence?

cefn commented 7 years ago

Kafka wasn't relevant, given that the comment about 'wrong architecture' (https://github.com/mcollina/mosca/issues/385#issuecomment-167947454) suggested an inherent scalability and responsiveness issue; hence the choice was either Aedes (not stable at the time for our usage) or migrating the MQTT broker out of the Node ecosystem altogether (using Mosquitto).

Everything except round-tripping in Mosquitto is highly performant and coded in C (but see https://github.com/eclipse/mosquitto/issues/147: we rolled our own Mosquitto which improves round-tripping by polling every millisecond instead of every 100 ms, a harmless load increase given that we can regulate CPU on a per-Docker-container basis).