moscajs / mosca

MQTT broker as a module
mosca.io

regex used on looking up retained messages #806

Open SteveAtSentosa opened 4 years ago

SteveAtSentosa commented 4 years ago

Hello,

When looking up the retained message in mongo for a topic being subscribed to, $regex is used

file mongo.js line 270

MongoPersistence.prototype.lookupRetained = function(pattern, cb) {
  var actual = escape(pattern).replace(/(#|\\\+).*$/, '');
  var regexp = new RegExp(actual);
  var stream = this._retained.find({ topic: { $regex: regexp } }).stream();
  ...

The regex is causing a significant performance issue for us: we have about 18,000 topics, and the index in the DB is moot because of the regex. As clients subscribe to topics over time, this lookup maxes out our mongo instance at 100% CPU usage, since the index cannot help and every topic has to be run through the regex.

I am wondering if a regex is required here? We are looking up a retained message for a topic and the topic matches exactly without the regex. Is there ever a case where regex is required?
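To make the question concrete, here is a minimal sketch of what the two lines at the top of lookupRetained compute for different subscription patterns. `escapeRegExp` is a stand-in for the `escape` helper (its exact behaviour is an assumption here) and `retainedPrefix` is a hypothetical name, not something from mosca. As far as the snippet shows, the pattern is only shortened when it contains an MQTT wildcard (`+` or `#`); an exact topic passes through unchanged.

```javascript
// Stand-in for the `escape` helper used in mongo.js. Assumption: it
// backslash-escapes regex metacharacters; the real helper may differ.
function escapeRegExp(str) {
  return str.replace(/[|\\{}()[\]^$+*?.]/g, '\\$&');
}

// Same transformation as the snippet: escape the pattern, then cut it at
// the first MQTT wildcard ('#', or '+', which is '\+' after escaping).
function retainedPrefix(pattern) {
  return escapeRegExp(pattern).replace(/(#|\\\+).*$/, '');
}

// Exact topic: nothing is stripped.
console.log(retainedPrefix('networks/abc/pucks/def/active-alerts'));
// → 'networks/abc/pucks/def/active-alerts'

// Wildcards: only the literal part before the first wildcard survives.
console.log(retainedPrefix('networks/+/pucks/def/active-alerts'));
// → 'networks/'
console.log(retainedPrefix('networks/abc/#'));
// → 'networks/abc/'
```

So the regex looks like a pre-filter for wildcard subscriptions, while for an exact topic it is the full topic string, just unanchored.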

Just for context, the mongo command that ends up being issued is

 { find: "retained", filter: { topic: { $regex: 
   /networks\/5d7fc5fc2262470011c7f117\/pucks\/5bc01d018517360012592764\/active-alerts/ 
} } } 

where the topic is /networks/5d7fc5fc2262470011c7f117/pucks/5bc01d018517360012592764/active-alerts/

Which matches exactly without regex.
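One possible direction, sketched under assumptions: use plain equality when the subscription has no wildcard, and an anchored prefix regex otherwise. `retainedFilter` and `escapeRegExp` are hypothetical helpers, not part of mosca. The sketch relies on two things I believe hold: MQTT filters match level by level from the start of the topic, so anchoring with `^` should not lose matches, and MongoDB can use an index for case-sensitive prefix regexes that start with `^`.

```javascript
// Hypothetical helper: backslash-escape regex metacharacters.
function escapeRegExp(str) {
  return str.replace(/[|\\{}()[\]^$+*?.]/g, '\\$&');
}

// Hypothetical helper: build the filter for the `retained` collection.
function retainedFilter(pattern) {
  // No MQTT wildcard: the topic has to match exactly, so an equality
  // filter is enough and can be answered from an index on `topic`.
  if (!/[#+]/.test(pattern)) {
    return { topic: pattern };
  }
  // Wildcard: keep the literal part before the first wildcard and anchor
  // it with '^' so the regex becomes a prefix match.
  var prefix = escapeRegExp(pattern.replace(/[#+].*$/, ''));
  return { topic: { $regex: '^' + prefix } };
}

console.log(retainedFilter('networks/abc/pucks/def/active-alerts'));
console.log(retainedFilter('networks/abc/#'));
```

The prefix filter still returns a superset for patterns such as `a/+/c`, so whatever matching the rest of lookupRetained does on the streamed results would still be needed. A pattern that starts with a wildcard gives an empty prefix and still scans everything.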

Kind Regards Steve

jimmiehansson commented 4 years ago

This makes sense. The scan cost over the data set is large because the query cannot narrow the data down to a working set: the regex has to be tested against every stored topic, and the long-running cursor ties up the instance while it runs. I would suggest moving from MongoDB to Redis for this reason, unless you plan to introduce heavily sharded indices at this point.