servalproject / serval-dna

The Serval Project's core daemon that implements Distributed Numbering Architecture (DNA), MDP, VoMP, Rhizome, MeshMS, etc.
http://servalproject.org

Rhizome parallelization #68

Open rom1v opened 11 years ago

rom1v commented 11 years ago

I recently found that MDP packets and Rhizome work are processed on the same thread. As a consequence, audio communications break during Rhizome synchronization.

I propose some code that executes all the Rhizome work in a separate POSIX thread (only in overlay mode; it is not necessary elsewhere).

With this code, I successfully exchange Rhizome bundles without impacting audio communications.

Implementation

Here are the principles of my implementation (for more information, read the messages of the first commits).

The main idea is to use two fdqueue instances: one "main" and one "rhizome". In overlay mode, a new thread is started, which calls fd_poll() on the rhizome fdqueue.
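To give the idea, here is a minimal sketch (the signatures are simplified for the example; the real fdqueue API is richer):

```c
#include <pthread.h>
#include <stdbool.h>

/* Sketch only: simplified stand-ins for the real fdqueue API. */
struct fdqueue;                         /* watched fds + scheduled alarms */
extern struct fdqueue main_fdqueue;     /* latency-sensitive work (MDP, VoMP) */
extern struct fdqueue rhizome_fdqueue;  /* bulk Rhizome work */

bool fd_poll(struct fdqueue *q);        /* run one iteration of q's loop */

/* Entry point of the new thread: the same poll loop as the main thread,
   but driven by the rhizome fdqueue. */
static void *rhizome_run(void *arg)
{
  (void)arg;
  while (fd_poll(&rhizome_fdqueue))
    ;
  return NULL;
}

/* Called once at startup, only in overlay mode. */
void rhizome_start_thread(void)
{
  static pthread_t rhizome_thread;
  pthread_create(&rhizome_thread, NULL, rhizome_run, NULL);
}
```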

An alarm (sched_ent) now has an fdqueue field specifying on which fdqueue the alarm must be scheduled/watched. Thus, the main thread can schedule an alarm to be executed on the rhizome thread (and vice versa).

Of course, the alarm data (its context) cannot be stored on the stack (there is one stack per thread), so these alarms must be allocated dynamically, which implies quite a lot of changes. Commits whose message begins with "Alarm" create a wrapper to be scheduled on an fdqueue. Commits whose message begins with "Schedule" allocate parameters (if needed) and schedule an alarm to be executed on the other thread.
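For illustration, the pattern looks roughly like this (the names are invented for the example; the real context structures are the *_arg structs in parallel.h):

```c
#include <stdlib.h>

/* Minimal stand-ins for the real types (sketch only). */
struct fdqueue;
extern struct fdqueue rhizome_fdqueue;

struct sched_ent {
  struct fdqueue *fdqueue;                    /* new field: target queue */
  void (*function)(struct sched_ent *alarm);  /* callback */
  long long alarm;                            /* when to run (ms) */
  /* ... poll descriptor, deadline, stats ... */
};
void schedule(struct sched_ent *alarm);       /* enqueue on alarm->fdqueue */

/* A heap-allocated context wrapping the alarm plus its parameters. */
struct example_arg {
  struct sched_ent alarm;   /* first member, so the cast below is valid */
  int bundle_id;            /* hypothetical parameter for the action */
};

static void example_alarm(struct sched_ent *alarm)
{
  struct example_arg *arg = (struct example_arg *)alarm;
  /* ... do the Rhizome work with arg->bundle_id, on the rhizome thread ... */
  free(arg);                /* the context lives on the heap, so free it */
}

/* Called from the main thread to run some work on the rhizome thread. */
void post_example(int bundle_id)
{
  struct example_arg *arg = malloc(sizeof *arg);  /* no error handling: sketch */
  arg->alarm.function = example_alarm;
  arg->alarm.fdqueue = &rhizome_fdqueue;          /* execute on the other thread */
  arg->alarm.alarm = 0;                           /* as soon as possible */
  arg->bundle_id = bundle_id;
  schedule(&arg->alarm);
}
```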

Tests

All tests pass, except FileTransferUnreliableBigMDP. This test sometimes fails even without my changes. If I change the file size from 1MB to 500kB (in setup_bugfile_common() in tests/rhizomeprotocol), it passes.

rom1v commented 11 years ago

> Perhaps now is a good time to separate payload encryption / decryption from dependence on "struct overlay_mdp_frame". Then any future services which require more processing time can easily be shifted to a background thread or other processing queue, simply by passing whole "struct overlay_frame"'s around. Then parsing, processing and replying can be done in the background thread without needing to create a new alarm function and context data structure for each use case.

I only added the glue to make the current code multithreaded, without changing the logic. I agree, the missing part of the work is to rewrite some algorithms, when necessary, with multithreading in mind.

> We have internally discussed the idea of splitting rhizome into a completely separate process.

Paul also [said](https://groups.google.com/d/msg/serval-project-developers/kynmhNjv_To/S6cewSRXLYcJ):

> I think in the longer term we want to put it in a separate process to avoid residual inter-thread locks

I read somewhere that you wanted to avoid threads and use only single-threaded processes. The reasons are unclear to me. Is avoiding residual inter-thread locks the main reason? Could you explain? Both have advantages and disadvantages ([random link about it](http://stackoverflow.com/questions/617787/why-should-i-use-a-thread-vs-using-a-process)), but I don't know why you want Serval to always avoid threads a priori.

lakeman commented 11 years ago

On Tue, Aug 13, 2013 at 7:27 PM, ®om notifications@github.com wrote:

> Perhaps now is a good time to separate payload encryption / decryption from dependence on "struct overlay_mdp_frame". Then any future services which require more processing time can easily be shifted to a background thread or other processing queue, simply by passing whole "struct overlay_frame"'s around. Then parsing, processing and replying can be done in the background thread without needing to create a new alarm function and context data structure for each use case.

> I only added the glue to make the current code multithreaded, without changing the logic.

Yes, I can tell. It's a very tidy series of patches that does nothing more than it claims. Though you have left out the question of logging, another part of the application that uses global variables and would need to be resolved before merging.

I've split overlay_mdp_dispatch so that internal services that only send packets over the network, and not to local mdp clients, can be rewritten to call overlay_send_frame instead. All of the rhizome-related network packets should meet that criterion. Then we just need a single method for passing overlay_frames to the main thread to call this function.
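Roughly, a call site would end up looking like this (a sketch only; the allocator helper and the frame fields are placeholders, not the real code):

```c
/* Stand-in declarations (sketch): the real definitions live in serval-dna. */
struct overlay_frame;
struct overlay_frame *alloc_zero_frame(void);         /* hypothetical helper */
int overlay_send_frame(struct overlay_frame *frame);  /* the new entry point */

/* An internal service that never addresses local MDP clients can skip
   overlay_mdp_dispatch entirely: build a frame and hand it to the network
   layer. */
int example_service_send(void)
{
  struct overlay_frame *frame = alloc_zero_frame();
  /* ... fill in source, destination, queue and payload ... */
  return overlay_send_frame(frame);
}
```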

Instead of allocating a new alarm per frame, we can probably build some kind of producer -> consumer pipe using the existing next / prev list pointers in this structure. Or only pass one frame from the background thread at a time, blocking until the main thread has consumed and queued the previous packet. It's a background thread for long-running processes; we probably don't care if it stalls when the main thread is busy with latency-sensitive tasks. But I wouldn't call myself an expert on writing good multi-threaded C code.
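Something along these lines, with the struct as a stand-in for the real struct overlay_frame (a sketch, not tested):

```c
#include <pthread.h>
#include <stddef.h>

/* Stand-in for the real struct overlay_frame, which already carries
   prev / next list pointers. */
struct overlay_frame {
  struct overlay_frame *prev, *next;
  /* ... addressing, payload, etc. ... */
};

static struct overlay_frame *pipe_head, *pipe_tail;
static pthread_mutex_t pipe_lock = PTHREAD_MUTEX_INITIALIZER;

/* Background (rhizome) thread: hand a frame over to the main thread. */
void pipe_push(struct overlay_frame *f)
{
  f->next = NULL;
  pthread_mutex_lock(&pipe_lock);
  f->prev = pipe_tail;
  if (pipe_tail)
    pipe_tail->next = f;
  else
    pipe_head = f;
  pipe_tail = f;
  pthread_mutex_unlock(&pipe_lock);
  /* ... then wake the main thread's fd_poll(), e.g. via a self-pipe fd
     that the main fdqueue already watches. */
}

/* Main thread: take the next frame to queue for transmission, or NULL. */
struct overlay_frame *pipe_pop(void)
{
  pthread_mutex_lock(&pipe_lock);
  struct overlay_frame *f = pipe_head;
  if (f) {
    pipe_head = f->next;
    if (pipe_head)
      pipe_head->prev = NULL;
    else
      pipe_tail = NULL;
  }
  pthread_mutex_unlock(&pipe_lock);
  return f;
}
```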

On that point, should we rename the "rhizome" thread to the "background" thread? Just because we're shifting rhizome operations there now doesn't mean we won't shift other stuff off the main thread later.


rom1v commented 11 years ago

> Though you have left out the question of logging, another part of the application which uses global variables and would need to be resolved before merging.

Ah? Where are these variables?

> I've split overlay_mdp_dispatch such that internal services that only send packets over the network, and not to local mdp clients, can be rewritten to call overlay_send_frame instead.

Which services send packets to "local" mdp clients? And what are these "local" mdp clients?

> Instead of allocating a new alarm per frame, we can probably build some kind of producer -> consumer pipe using the existing next / prev list pointers in this structure.

Maybe.

But that way, it could only be used for passing frames from one thread to another.

My idea was to pass "runnables" (a generic function + argument, to post whatever action you want). In practice, the alarms I schedule do not always post frames (see the *_arg structures in parallel.h). I don't know whether all these action arguments could be reduced to "frame data".
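A sketch of what I mean, reusing the sched_ent/fdqueue fields from my patch (illustrative names again):

```c
#include <stdlib.h>

/* Stand-ins, as in the earlier sketch. */
struct fdqueue;
struct sched_ent {
  struct fdqueue *fdqueue;
  void (*function)(struct sched_ent *alarm);
  long long alarm;
};
void schedule(struct sched_ent *alarm);

/* A generic "runnable": one wrapper for every use case instead of a
   dedicated alarm function + *_arg struct per action. */
struct runnable {
  struct sched_ent alarm;   /* first member, so the cast below is valid */
  void (*run)(void *arg);   /* the action to perform */
  void *arg;                /* its (heap-allocated) argument */
};

static void runnable_alarm(struct sched_ent *alarm)
{
  struct runnable *r = (struct runnable *)alarm;
  r->run(r->arg);
  free(r);
}

/* Post an arbitrary action to another thread's fdqueue. */
void post_runnable(struct fdqueue *q, void (*run)(void *), void *arg)
{
  struct runnable *r = malloc(sizeof *r);  /* no error handling: sketch */
  r->alarm.function = runnable_alarm;
  r->alarm.fdqueue = q;
  r->alarm.alarm = 0;
  r->run = run;
  r->arg = arg;
  schedule(&r->alarm);
}
```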

> Or only pass one frame from the background thread at a time, blocking until the main thread has consumed and queued the previous packet. It's a background thread for long running processes, we probably don't care if it stalls when the main thread is busy with latency sensitive tasks.

I don't know whether the overhead of these malloc/free calls is significant.

The way I've implemented it uses the same mechanism for both the main thread and the rhizome thread. As a consequence, if the rhizome thread blocks waiting for the main thread to be idle, then the main thread can likewise block waiting for the rhizome thread to be idle, which risks a deadlock. The situation where the main thread needs to post a runnable on the rhizome thread does occur (1 2 3), but maybe it can be avoided…

> On that point, should we rename the "rhizome" thread to the "background" thread? Just because we're shifting rhizome operations there now, doesn't mean we wont shift other stuff off the main thread later.

I've considered this background thread to be rhizome-specific: another service would get its own thread too… although even a single service could have several threads.

> We've considered turning rhizome into an mdp client.

I think it is a good idea.

Ideally, I think Rhizome could simply work like any other service on top of MDP: it would open an MDP socket on a predefined port and exchange data with other peers, without any lower-level knowledge and, above all, without being referenced by any lower-level code.

This would remove the need for the "internal services" hack: each service would use its own port dynamically (as with TCP or UDP).

In that case, Rhizome would create its own thread to handle its work separately, without impacting the overlay* code.
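For example, something like this, with a deliberately invented MDP client API (every name below is made up for illustration; the point is only the well-known port and the dedicated thread):

```c
#include <pthread.h>
#include <stdint.h>
#include <stddef.h>
#include <sys/types.h>

/* Invented MDP client API, for illustration only. */
#define MDP_PORT_RHIZOME 7u               /* a predefined, well-known port */
typedef unsigned char sid_t[32];          /* a peer's subscriber id */

int     mdp_socket(void);
int     mdp_bind(int sock, uint32_t port);
ssize_t mdp_recv(int sock, void *buf, size_t len, sid_t src);
ssize_t mdp_send(int sock, const sid_t dst, uint32_t port,
                 const void *buf, size_t len);

/* Rhizome as a plain MDP service with its own thread: no hooks into
   the overlay* code, no lower-level knowledge. */
static void *rhizome_service(void *arg)
{
  (void)arg;
  int sock = mdp_socket();
  mdp_bind(sock, MDP_PORT_RHIZOME);
  sid_t peer;
  unsigned char buf[1200];
  for (;;) {
    ssize_t n = mdp_recv(sock, buf, sizeof buf, peer);
    if (n < 0)
      break;
    /* ... parse the sync message, then answer the peer with mdp_send() ... */
  }
  return NULL;
}

void rhizome_service_start(void)
{
  static pthread_t t;
  pthread_create(&t, NULL, rhizome_service, NULL);
}
```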

What do you think?