moquette-io / moquette

Java MQTT lightweight broker
http://moquette-io.github.io/moquette/
Apache License 2.0
2.31k stars 818 forks source link

Heavy memory and CPU usage and stuck threads #119

Closed bardyamomeni closed 8 years ago

bardyamomeni commented 8 years ago

Hi

I am getting a lot of problems, I just wanted to let you know. Lot of threads are stuck in TreeNode's activate(), deactivate, copy() and findAllByClientID() methods. And a lot of objects are created, instances of Subscription class.

My servers have high load, I expect a lot of connections per second.

trace

capture

I hope the screenshots help resolving this issue

andsel commented 8 years ago

Hi @bardyamomeni for your issue. The screenshots help partially. I need more stuff to create a test that reproduce your case. One great thing (if you can provide me) is the full moquette log, at INFO or DEBUG levels. The other is a desription of your use case, for example how many clients and how many subscriptions are present in Moquette, and a summary of the MQTT flow each client has with the broker.

Best regards Andrea

bardyamomeni commented 8 years ago

Hi @andsel . Thank you for your reply. I'll tell the situation which this happens. I have to java applications, one is a server and one is a client (although there are many clients). the broker is configured with two credentials, 'root' which can subscribe to topics and etc (full broker access) and a 'public' than can only connect. The server application connects to broker with root credential but the client's client ID, subscribes to the required topic and then disconnects from broker, then the address of the broker is passed to the client using a previously made socket connection. At this point the user is able to connect to the broker with 'public' credentials and know it is subscribed with all the topics it must.

Now I have about 20 to 30 types of this request per second for the broker and my server to process. It seems that the huge amount of subscriptions with different client ID's make the subscription tree huge, and it's processing time long.

I want to help and contribute in moquette project too. I am now using mosquitto 1.4.4 for my broker and the it is stable but the problem I have with mosquitto is that it seems to forget to send last will messages that I use to know that a client is disconnected from the broker.

I think the problem raises when too many subscriptions per second are made. the tree node's activate and copy() method plays a huge role in this performance loss. but that's just my opinion.

I will gather more data as you requested and will post another comment.

Thank you again ;)

andsel commented 8 years ago

Hi @bardyamomeni I'm working on refactoring the subscription tree and the storage part for sessions, lowering a lot the pressure on the tree for activate/disactivate (the previous implementation was poor)

bardyamomeni commented 8 years ago

Hi @andsel That's great to know. I am working on a project that is using MQTT as part of its process. The MQTT broker has a lot of pressure in my designed flow. I replace your subscription store implementation with what is optimized with my needs. My solution is super simple because it doesn't need any {#} or {+} subscriptions. Tomorrow I'll send my implementation of the subscription store, maybe it could help. It uses Maps primarily to activate client's subscriptions. If you want to test your implementation I would be grateful to help, I can make about 1 million connects (and about 200K subscription requests) in about 24 hours with real clients. I can provide test results and profiling data for you. And great work with moquette project. It's my broker of choice.

fjcyue commented 8 years ago

I hava a very similar use case. when thousands of client connected, the jvm heap is quite large (about 1G for 1000 clients) what's the road map of subscription refactoring?

bardyamomeni commented 8 years ago

Hi @andsel I developed my solution of your SubscriptionStore class. Please check it out and let me know. just download it and rename it to ,java. Mapper.txt

andsel commented 8 years ago

HI, I've changed a little bit the storage part and the way the broker activate/disactivate the subscriptions in the tree, you work is aligned with latest stuff from Moquette master, if yes could you describe in more detail how you fixed the problem.

Andrea

On Sat, Nov 21, 2015 at 3:00 PM, bardyamomeni notifications@github.com wrote:

Hi @andsel https://github.com/andsel I developed my solution of your SubscriptionStore class. Please check it out and let me know. just download it and rename it to ,java. Mapper.txt https://github.com/andsel/moquette/files/40839/Mapper.txt

— Reply to this email directly or view it on GitHub https://github.com/andsel/moquette/issues/119#issuecomment-158643836.

bardyamomeni commented 8 years ago

Hi again @andsel My solution (which is draft right now and can be considered a quickfix) is heavily depends on hash maps. as you can see in the Mapper class that I sent you, the main data structure is a Map<String,Map<String,Subscription>>.

The outer Map is keyed with topic filters and the inner Map is keyed with the client ID. As for multi-hierarchy topics, when a publish receives at the broker, the broker must search for at least 2^n + 2*n + 1 topics that the publish matches. for example if a publish is received for a topic like a/b/c all of these subscriptions below will be valid to receive the payload: 1) a/b/c 2) a/b/+ 3) a/+/c 4) a/+/+ 5) +/b/c 6) +/b/+ 7) +/+/c 8) +/+/+ 9) a/b/# 10) a/# 11) # 12) +/# 13) +/+/# 14) a/+/# 15) +/b/#

It is visible that there are too many topics to search for, so I used Hashmaps as the base class to hold the subscription and a nested hashmap for each entry to search even faster with a client ID for a specific topic.

Because my requirement is to hold the data in memory for speed (not using the map DB), I read your MemoryPersistance class and changed it.

andsel commented 8 years ago

If this path works for you it's great! I see some problems:

I don't think this is good patch for the Moquette broker.

Andrea

bardyamomeni commented 8 years ago

@andsel There is no problem with downgrading to java 7. as for the convolution, we can filter out topic filters prematurely. for example there could be tree that we can check for presence of topics. Lets call this tree TopicPresenceTree. for example we have a method in this class called "boolean checkPresence(int level, String token)" that returns true if a specific token is present in subscriptions store at the specific topic hierarchy level. Consider my last post, if we use this method like below:

if(topicPresenceTree.checkPresence(1,"+"){
  // continue searching for eligible clients that have a subscription 
  // with "+" in it at topic hierarchy level 1
} else {
 // No Op
}

the statement will search a tree to find if there are any "+" token subscribed at level 1 of all subscriptions, then we can filter out about half of the 15 valid topics. This could be a minor optimization.

Please let me know what you think

Bardya

fjcyue commented 8 years ago

I do a minor change to fix the OOME in my usecase. I have 10,000+ clients subscribe the same topic, if there is a newcomer ,the full tree is copied when addSubscription I modified the copy() method in TreeNode, replace the copy.m_subscriptions.add(new Subscription(sub)) with copy.m_subscriptions.add(sub) since the Subscription object is immutable after construction

BTW: search of Subscription is a bit slow, maybe it's better of using Map instead of List in TreeNode

andsel commented 8 years ago

The problem was due to bad cleaned reference to parent's node, so that nothing was ever cleaned + excessive dumpTree that generated a lot of consupmtion.