kohler / click

The Click modular router: fast modular packet processing and analysis
Other
740 stars 321 forks source link

RCU Based HashTable and Corresponding Element #364

Open nmcglo opened 7 years ago

nmcglo commented 7 years ago

Current HashTable requires some synchronization mechanism for multithreaded access. New HashTable_RCU provides a lockless threadsafe hashtable modeled after Click's HashTable to allow for simultaneous multiple readers and a single writer. (internal writer synchronization is still a WIP).

This locklessness is achieved through Read-Copy-Update (RCU) to ensure valid, although potentially stale, data to all readers without needing to ever wait to access the structure. Writers will only need to wait for other writers to finish before modifying.

I've created an element that utilizes this data structure as well. This element counts ICMP packets hashed on the incoming IP address. These counts are stored in a HashTable_RCU instance and acts as an example for safe usage of the structure.

bcronje commented 7 years ago

Just to confirm, this is for kernel mode only right?

Assuming that is the case, I wonder where hashtable_rcu.hh should live, as typically anything under include/click is usable by either user or kernel modes. Maybe this calls for include/click/linuxmodule similar to how we have elements/linuxmodule. @kohler would be the person to make this call.

What would be required to make this compatible in user mode? I know there are a number of user mode RCU libraries like https://github.com/urcu/userspace-rcu , would be interested to know if hashtable_rcu could be used with something like that?

That said, looks really interesting. Do you have any performance comparisons between HashTable and HashTable_RCU in both single and multiple threads (using Spinlock or ReadWriteLock in case of HashTable)?

nmcglo commented 7 years ago

You're correct, kernel mode only at this point. And I actually initially prototyped it up in userspace using Liburcu (that library you mentioned) a few months ago. However currently the core bucket structure is built using linux hlists. Liburcu has things similar to hlist and its associated macros but are obviously named differently. But there's nothing that would prevent this from being ported though.

Unfortunately I don't have a working multithreaded click instance yet to obtain quantitative data.

tbarbette commented 7 years ago

This is very great and a definitely good step in the right direction. Both RCU and multi-thread safe HashTable are missing since a long time. I was looking into RCU integration in Click to protect the many element handlers that update >64bit data structures in MT context.

Personally, I'm facing panic problems with Kernel multi-thread. I think performance review is a must before merging. I'm a bit worried about how RCU writes will scale with many RCU in the fastpath, could the deferred destruction stack up to a point where it introduce noticeable jitter? Is a counter the right element for RCU? It's more write than read, no?

I think, given recent advances (/ non-advances) of kernel mode, having a solution working for both modes would be better. Even if it is only for testing purposes. This kind of datastructure needs a test or two... As you probably saw, the commits are also breaking userlevel as of now.

I looked a little at the integration of liburcu recently. It doesn't seem hard at all, the QSBR would suit Click simply adding the call to rcu_quiescent_state() in the routerthread loop leading to something very similar to Linux's behaviour.

nmcglo commented 7 years ago

I agree that performance review is definitely a priority. I'm working on getting it running on a linux machine at home to work out the kinks. I no longer have access to the codebase or machines that this was running on so there will be a little downtime whilst I spin up my own click instance.

The counter is decidedly not the best case element for RCU. I started on this 12 weeks ago knowing nothing about Click so I just picked an element that could be easily described in 20 words and anyone could understand that could utilize a hashtable as a POC. That being said, you can use the IPCounter element as a read heavy structure if you ping it from the same IP addresses - as modifying the values of items already in the table are technically read operations.

Adding userspace functionality to this wouldn't be hard, really just need to modify the bucket storage method to use userspace lists and import liburcu. The library even includes a comparable structure to linux's h_list.