multipath-tcp / mptcp_net-next

Development version of the Upstream MultiPath TCP Linux kernel 🐧
https://mptcp.dev

Round-Robin Packet Scheduler Support #517

Closed Anankke closed 3 weeks ago

Anankke commented 3 weeks ago

Ref: #194

The round-robin packet scheduler deserves reconsideration for inclusion in MPTCP. While it may be seen as suboptimal in typical use cases, its value becomes evident in specific, challenging network environments—particularly those under heavy censorship, where alternative schedulers may not suffice.

One such environment is mainland China, where the Great Firewall (GFW) employs sophisticated mechanisms capable of reconstructing and analyzing TCP flows. In these conditions, the round-robin scheduler in MPTCP could play a pivotal role in evading such censorship. By distributing packets across multiple subflows, the round-robin scheduler makes flow reconstruction substantially more difficult, thus increasing the chances of bypassing traffic inspection and censorship.

Consider a scenario where a user connects via a VPS hosted on an unrestricted network, utilizing two IP addresses from the same datacenter. The round-robin scheduler would distribute packets across both IPs, effectively obfuscating traffic patterns and hindering the GFW’s ability to detect and block content. In cases like this, users could maintain access to uncensored information, making the round-robin scheduler indispensable in censorship-heavy environments.

Although it might not be the best choice under ordinary network conditions, it is essential to recognize its critical role in edge cases. When subflows exhibit near-identical network characteristics, the round-robin scheduler performs efficiently and offers a viable option to evade censorship. The claim that the scheduler is “useless” overlooks these important use cases, where network conditions are anything but ideal.

Historically, the out-of-tree MPTCP round-robin scheduler has empowered many users to bypass censorship barriers. Unfortunately, this functionality has been lost with the transition into the kernel tree. Given its demonstrated utility in restrictive environments, I strongly advocate for the reinstatement of the round-robin packet scheduler.

This request is ultimately about ensuring access to free information under adverse conditions. Reinstating the round-robin scheduler could provide a vital tool for users in censored regions, enabling them to communicate and access information freely.

matttbe commented 3 weeks ago

Hello,

Thank you for this ticket.

The round-robin packet scheduler from the out-of-tree MPTCP kernel was designed as a toy, just to easily check it was possible to create new packet schedulers in this kernel. To be honest, this is the first time I've heard of people using it for something useful. (But still, I'm surprised to see this technique being used to obfuscate traffic, instead of using encryption. Is it because encryption is blocked?)

Please note that the #194 ticket has been closed, not because we don't want to have any round-robin packet schedulers, but in favour of #75: the idea is to have new custom packet schedulers implemented in BPF. The work is still in progress, but there is already a round-robin packet scheduler implemented in BPF, see here. It is still WIP, because the API is not ready yet: for example, this RR packet scheduler is limited to doing round-robin per burst of data, not per packet as it should. Hopefully, this will be fixed soon, and will allow new packet schedulers to be implemented with BPF.
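To make the per-burst versus per-packet distinction concrete, here is a minimal user-space sketch in Python. It is only an illustration of the two rotation granularities, not the BPF scheduler itself, and it uses no kernel or BPF API. The function names, the subflow labels, and the representation of a burst as a list of packets are all assumptions made for this example.

```python
from itertools import cycle


def round_robin_per_packet(packets, subflows):
    """Hypothetical helper: hand each packet to the next subflow in turn."""
    rotation = cycle(subflows)
    return [(pkt, next(rotation)) for pkt in packets]


def round_robin_per_burst(bursts, subflows):
    """Hypothetical helper: hand each whole burst to the next subflow in turn.

    A burst is modelled here as a list of packets (an assumption for
    illustration); every packet of a burst lands on the same subflow.
    """
    rotation = cycle(subflows)
    assignment = []
    for burst in bursts:
        subflow = next(rotation)  # rotate once per burst, not per packet
        assignment.extend((pkt, subflow) for pkt in burst)
    return assignment


if __name__ == "__main__":
    subflows = ["subflow-A", "subflow-B"]
    # Per packet: consecutive packets alternate between subflows.
    print(round_robin_per_packet([0, 1, 2, 3], subflows))
    # Per burst: packets 0-1 stay together, then packets 2-3 stay together.
    print(round_robin_per_burst([[0, 1], [2, 3]], subflows))
```

With per-burst rotation, long runs of consecutive data stay on one subflow, which is why it differs from the per-packet behaviour requested in this ticket.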

Anankke commented 3 weeks ago

Thank you for the prompt and detailed reply.

I really appreciate the clarification on ongoing efforts with BPF-based schedulers. It’s great to know that this feature is being worked on and I look forward to the eventual per-packet round-robin scheduler implementation.

> instead of using encryption. Is it because encryption is blocked?

To address your question: Yes, encryption protocols such as TLS or Shadowsocks are used widely, but unfortunately, the Great Firewall (GFW) has advanced to the point where machine learning-based fingerprint analysis can usually identify and throttle or block encrypted traffic altogether. This is one of the reasons why dispersing encrypted traffic across multiple IPs or subflows is an effective complementary strategy to avoid detection. By distributing traffic in this way, it becomes significantly harder for the GFW to reconstruct flows and apply its censorship mechanisms effectively. This is particularly valuable when encrypted connections are flagged or outright blocked.

There used to be another technique that fiddled with the window size: during the handshake, when the client connects to the server, the server would initially send back a smaller window size. This forced the client to split the handshake packet into two, avoiding TCP header detection. However, as the GFW's reassembly capabilities improved and dynamic blocking based on machine learning was implemented, this method became less effective. Not to mention that this approach also negatively impacts TCP's speed and scheduling.
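As a rough illustration of the window-size trick described above, the following Python sketch shows how a small advertised window caps how much data can go out in one piece, so a single message ends up in several segments. It is a deliberately simplified model: real TCP segmentation also depends on the MSS, the congestion window, and ACK timing. The function name `split_by_window` and the placeholder payload are hypothetical.

```python
def split_by_window(payload: bytes, window: int) -> list[bytes]:
    """Hypothetical helper: cut a payload into chunks no larger than `window`.

    Models a sender that may only have `window` bytes in flight at once,
    which is a simplification of real TCP flow control.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    return [payload[i:i + window] for i in range(0, len(payload), window)]


if __name__ == "__main__":
    message = b"handshake-bytes"  # placeholder, not a real handshake
    # A window smaller than the message forces it into two segments.
    print(split_by_window(message, 8))
    # A window at least as large as the message leaves it in one piece.
    print(split_by_window(message, 64))
```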

Once again, thank you for your contribution and for considering the diverse needs of users in censorship-heavy environments. I’m eager to see how this evolves, and I would be happy to assist with testing or providing further feedback as the work progresses.

matttbe commented 3 weeks ago

Thank you for the explanations! It's a shame such techniques are needed to access the Internet, but it's nice that creative methods exist :)

Please note that it is easy to identify MPTCP, force a fallback to TCP, and even reconstruct the flow if someone has access to all the connections. But as long as nothing against it is done, that's good!
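To show why access to all the connections is enough to rebuild the flow: MPTCP maps the data on each subflow to a connection-level data sequence number, so an observer who captures every subflow can merge the pieces in order. The sketch below is a simplified Python model of that idea, assuming each capture is a list of `(data_seq, payload)` tuples. That record format and the function name `reassemble` are assumptions for illustration; the sketch ignores retransmissions and sequence number wrap-around.

```python
def reassemble(subflow_captures):
    """Hypothetical helper: rebuild a byte stream from per-subflow captures.

    Each capture is assumed to be a list of (data_seq, payload) tuples,
    where data_seq is the connection-level offset of the payload.
    """
    records = [rec for capture in subflow_captures for rec in capture]
    records.sort(key=lambda rec: rec[0])  # order by data sequence number
    stream = bytearray()
    expected = records[0][0] if records else 0
    for data_seq, payload in records:
        if data_seq != expected:
            # A gap means at least one subflow was not observed.
            raise ValueError(f"gap or overlap at data sequence {data_seq}")
        stream += payload
        expected += len(payload)
    return bytes(stream)


if __name__ == "__main__":
    capture_a = [(0, b"GET "), (8, b"HTTP")]
    capture_b = [(4, b"/ab ")]
    # Seeing both subflows is enough to recover the original stream.
    print(reassemble([capture_a, capture_b]))
```

An observer who sees only one of the two subflows hits the gap check instead, which is the property the round-robin approach relies on.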

I suggest closing this ticket in favour of #75. TL;DR: a round-robin packet scheduler is being implemented in BPF.

Anankke commented 3 weeks ago

> But as long as nothing against it is done, that's good!

Absolutely, as long as our technology and methods remain ahead of the GFW’s evolving techniques, we can continue to provide people with uncensored access to the internet. Additionally, MPTCP has played a crucial role in significantly improving connection stability in such restrictive environments, which only reinforces its importance.

Regarding the round-robin scheduler, do you have any estimated timelines or ETAs for when we might expect the per packet BPF-based implementation to be ready for broader use?

matttbe commented 3 weeks ago

> Additionally, MPTCP has played a crucial role in significantly improving connection stability in such restrictive environments, which only reinforces its importance.

I didn't know, that's good!

> Regarding the round-robin scheduler, do you have any estimated timelines or ETAs for when we might expect the per packet BPF-based implementation to be ready for broader use?

No, sorry. Not for v6.12. Hopefully we can start doing stuff in v6.13, maybe in v6.14.