ntop / nDPI

Open Source Deep Packet Inspection Software Toolkit
http://www.ntop.org
GNU Lesser General Public License v3.0
3.8k stars 892 forks source link

I suggest use red-black tree instead of binary search tree to store the flows. #2363

Open AlexanderDymov opened 6 months ago

AlexanderDymov commented 6 months ago

Hello guys,

I would like to ask a question: The binary search tree (BST) in which flows are stored is not balanced. The complexity analysis of BST shows that, on average, the insert, delete and search takes O(log n) for n nodes. But in the worst case, they degrade to that of a singly linked list: O(n).

When processing each packet, 3 search operation in the tree are performed (2 calls to ndpi_tfind and 1 call to ndpi_tsearch). This can cause large delays if the number of flows is large (> 1 million) and the tree is degenerate into a list.

The question is: Why don't you use a balanced tree like a red-black tree where the worst case search cost is O(log n)?

Thank you for any hint

IvanNardi commented 6 months ago

You analysis is correct. However there is a missing point: nDPI (the library) doesn't do any flow management; this (important and critical) topic is left to the application. I am sure that any serious applications linking to nDPI is using a proper and efficient data structure for keeping flows information, tailored to the specific application needs. ndpiReader is only a basic example, aiming to show nDPI capabilities and how to use nDPI API: performance is not its goal. I am not sure because I was not involved with the project at the time, but it is quite likely that ndpiReader is using a simple tree only because that data structure was easily available at the time, or it was easy to implement.

Having said that, any effort from the community to improve (also) ndpiReader performance is welcomed :)

AlexanderDymov commented 6 months ago

Hi, Ivan!

Thank you for you answer.

I will research this problem and possible solutions.

Thank you!

mmanoj commented 2 months ago

@AlexanderDymov

Any update on your research?