BLocking ESTimation-based Scheduler (BLEST)

dweb32 commented 5 years ago

This pull request adds the BLocking ESTimation-based Scheduler (BLEST) as a module to MPTCP v0.94.

This scheduler works much like the default MPTCP scheduler. It always prefers the subflow with the smallest round-trip-time that is available. If the network paths are very heterogeneous, then BLEST can be more beneficial than the default scheduler because it estimates if sending on a "slower" subflow would cause head-of-line blocking and decides to wait until the "faster" subflow becomes available again.

BLEST was initially proposed and evaluated in this scientific publication:

Ferlin, S., Alay, Ö., Mehani, O., and Boreli, R. BLEST: blocking estimation-based MPTCP scheduler for heterogeneous networks. In 2016 IFIP Networking Conference, Networking 2016 and Work- shops, Vienna, Austria, May 17-19, 2016 (2016), IEEE, pp. 431–439.

This pull request is based on the implementation that was published by the original authors for MPTCP v0.89 here: https://bitbucket.org/blest_mptcp/mptcp-paper-ifip2016

I ported the code to MPTCP v0.94 and and tried to align it as close as possible to the official MPTCP project.

Note, that BLEST in this implementation does not consider if a subflow is assigned to backup state!

dweb32 commented 5 years ago

I have applied your feedback and rebased the pull request onto mptcp_trunk. I needed to force push my previous commits in order to resolve some conflicts in tcp_input.c. The original commit introduced mptcp_set_ca_state in tcp_input.c which was not used by BLEST and removed by my later commits causing the trouble.

I don't know why GitHub is still showing merge conflicts on this page, but now it should be correctly rebased. It would be great if you could re-review the code again.

cpaasch commented 5 years ago

Your PR doesn't compile anymore. We changed mptcp_for_each_sk to mptcp_for_each_sub with different arguments. Please test your change as well.

dweb32 commented 5 years ago

I'm sorry for overlooking the changes after rebasing the PR. I just fixed the wrong calls in mptcp_blest.c. But I will need some time to apply the remaining changes.

At this point I think I should also rewrite blest_get_available_subflow to become aware of backup-subflows as well. @sferlin, do you have any suggestions, if this would change the original algorithm design or semantics of the scheduler? Otherwise, adopting get_subflow_from_selectors from mptcp_sched.c should be fine to apply BLEST first on all active-subflows before going through the backup-subflows. In both cases, we just need to find the "global-best" and the "current-best" sks to compare their stats and decide whenever to return NULL instead of the "current-best" sk.

dweb32 commented 5 years ago

In the meantime, I have taken a closer look at the requested changes and also tried to refactor the original BLEST implementation to align better with the default scheduler. Now I got stuck and need some advice or general design decision how to rewrite BLEST such that it fits nice into MPTCP.

The first thing to notice is that although I already removed as much redundant code as possible, there are still many code lines that are identical to mptcp_sched.c. This isn't very surprising, because BLEST operates exactly like minRTT in most of the cases. Thus, these functions are very similar to the default scheduler:

mptcp_blest_next_segment
__mptcp_blest_next_segment
mptcp_blest_rcv_buf_optimization
blest_get_available_subflow
blest_get_subflow_from_selectors

Do you have any advice, how to get rid of this redundancy? The only true functional difference is in blest_get_subflow_from_selectors. One possible solution could be to add a parameter to the default scheduler to optionally enable the blocking estimation algorithm. Unfortunately, this would somehow contradict the current modular design of the schedulers.

An other problem is the size of the priv, but I don't have a solution to be more efficient yet. Maybe we split blestsched_priv into two structs (like it is already done in mptcp_redundant.c), one per subflow and one as control block for the scheduler. As of now, we have all these in one priv:

        u32 last_rbuf_opti;
    u32 min_srtt;
    u32 max_srtt;

    int lambda_1000;
    u32 last_lambda_update;
    u32 retrans_count;

I updated the PR just to discuss further changes to the code. But it won't compile at the moment with the smaller MPTCP_SCHED_SIZE. The last commit also includes changes in blest_get_available_subflow to be aware of backup-subflows.

cpaasch commented 5 years ago

In the meantime, I have taken a closer look at the requested changes and also tried to refactor the original BLEST implementation to align better with the default scheduler. Now I got stuck and need some advice or general design decision how to rewrite BLEST such that it fits nice into MPTCP.

The first thing to notice is that although I already removed as much redundant code as possible, there are still many code lines that are identical to mptcp_sched.c. This isn't very surprising, because BLEST operates exactly like minRTT in most of the cases. Thus, these functions are very similar to the default scheduler:

mptcp_blest_next_segment

__mptcp_blest_next_segment

mptcp_blest_rcv_buf_optimization

blest_get_available_subflow

blest_get_subflow_from_selectors

Do you have any advice, how to get rid of this redundancy? The only true functional difference is in blest_get_subflow_from_selectors. One possible solution could be to add a parameter to the default scheduler to optionally enable the blocking estimation algorithm. Unfortunately, this would somehow contradict the current modular design of the schedulers.

I am fine with having some code-duplication in blest and keep the default-scheduler untouched.

If BLEST is getting more and more popular, it can be refactored later on.

An other problem is the size of the priv, but I don't have a solution to be more efficient yet. Maybe we split blestsched_priv into two structs (like it is already done in mptcp_redundant.c), one per subflow and one as control block for the scheduler. As of now, we have all these in one priv:

Yes, splitting is probably the way to go.

        u32 last_rbuf_opti;
  u32 min_srtt;
  u32 max_srtt;

  int lambda_1000;
  u32 last_lambda_update;
  u32 retrans_count;
I updated the PR just to discuss further changes to the code. But it won't compile at the moment with the smaller MPTCP_SCHED_SIZE. The last commit also includes changes in blest_get_available_subflow to be aware of backup-subflows.

dweb32 commented 5 years ago

Finally, I'm back in the lab and have found some time to apply the requested changes and do some testing. I'm sorry it took so long to update the PR.

I was able to refactor the original BLEST code. Now, the MPTCP_SCHED_SIZE no longer needs to be increased because I splitted the blestsched_priv into two parts and placed bestsched_cb into the mpcb. Both structs are smaller and do fit. Thus, the changes to mptcp_ctrl.c were removed since this fix with the extra allocation isn't necessary any more.

In addition, I also removed some code of BLEST by directly calling get_available_subflow of the default scheduler instead.

matttbe commented 5 years ago

Hi @s6dlwebe ,

That's good the changes in the other files are now minimum! It will ease the integration here! I hope Christoph will have time to finish the review :)

If you have a bit more time, may you rebase your branch on latest origin/mptcp_trunk and launch checkpatch.pl script to uniform the code?

./scripts/checkpatch.pl --strict <path/to/your/patch>

(you can ignore long lines, we usually try to limit the line below ~100 chars, not the default 80)

dweb32 commented 5 years ago

If you have a bit more time, may you rebase your branch on latest origin/mptcp_trunk and launch checkpatch.pl script to uniform the code?

I rebased the pull request and fixed some of the styling errors. Here are the remaining problems (not including the line limits). These lines are identical with mptcp_sched.c, so i did not modify them.

./scripts/checkpatch.pl --strict -f net/mptcp/mptcp_blest.c 

CHECK: Avoid CamelCase: <TCP_CA_Open>
#299: FILE: net/mptcp/mptcp_blest.c:299:
+           if (tp->srtt_us < tp_it->srtt_us && inet_csk((struct sock *)tp_it)->icsk_ca_state == TCP_CA_Open) {

WARNING: Missing a blank line after declarations
#319: FILE: net/mptcp/mptcp_blest.c:319:
+       bool do_retrans = false;
+       mptcp_for_each_sub(tp->mpcb, mptcp) {

WARNING: else is not generally useful after a break or return
#332: FILE: net/mptcp/mptcp_blest.c:332:
+                   break;
+               } else {

total: 0 errors, 32 warnings, 1 checks, 497 lines checked

cpaasch commented 5 years ago

Looks good to me! @matttbe, you want to give your thumbs up as well?

matttbe commented 5 years ago

@cpaasch : I hope you don't mind if I merge this. You already said it was OK for you ;-)

@s6dlwebe again, thank you for your work!

dweb32 commented 5 years ago

@matttbe I noticed that the commit signature is now broken since you pushed to mptcp_tunk. Is this right?

matttbe commented 5 years ago

@s6dlwebe yes, that's because I asked Github not to create a merge commit (we don't do that for MPTCP-related changes) but to rebase on top of mptcp_trunk. I thought your commit was already on top of mptcp_trunk or maybe it was but it seems Github modified it.

Next time you create a patch, I will apply it like I usually do when patches come from the ML and sign it.

multipath-tcp / mptcp

BLocking ESTimation-based Scheduler (BLEST) #300