max-niederman / centipede

Centipede is a work-in-progress multipathing VPN for improving smartphone Internet connection reliability and performance.

Worker threads crash while connecting two different machines. #1

Open kaowul opened 4 months ago

kaowul commented 4 months ago

Is it possible for this project to consider integration with mp-quic?

max-niederman commented 3 months ago

There's a configuration example for a simple two peer network in peer1.toml and peer2.toml.

As for integration with MP-QUIC, I'm not sure what you have in mind exactly. Centipede operates over message-oriented protocols like raw UDP rather than stream-oriented protocols like QUIC.

kaowul commented 3 months ago

I built the project on two Ubuntu 22.04.3 LTS machines, and ping tests succeeded. When I tested network speed with iperf3, centipede reported the following error as soon as I started the client (iperf -c):

./target/debug/centipede peer2.toml
2024-04-24T16:03:57.878Z INFO centipede > spawned 2 worker threads
2024-04-24T16:03:58.079Z INFO centipede_control > received initiation acknowledgement from Gb1SnzVaODF8LHKoBD+dJtS/e/9GYEhnLX0sOk8OIbA=
× worker thread 1 failed
├─▶ failed to write to UDP socket
╰─▶ Resource temporarily unavailable (os error 11)

2024-04-24T16:04:09.207Z INFO centipede > shutting down due to error
2024-04-24T16:04:09.210Z INFO centipede > received shutdown signal, waiting for workers to finish...

kaowul commented 3 months ago

I used two VPS instances, and direct iperf speed tests were successful. Further testing revealed that when running "iperf3 -c 10.0.10.1 -u -b 500M", an error occurred, while "iperf3 -c 10.0.10.1 -u -b 300M" resulted in a successful connection to centipede. There were no significant abnormalities observed in CPU usage.

max-niederman commented 3 months ago

Hmm, this is very weird. I've been able to reproduce the same error on some local machines. It doesn't seem to happen between two netnses on the same machine, which is how I've been testing so far.

I'll keep looking into it. I suspect this could have something to do with device queues filling up.
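For readers following along: "Resource temporarily unavailable (os error 11)" is EAGAIN on Linux, which Rust's standard library reports as `io::ErrorKind::WouldBlock` when a write to a non-blocking socket cannot be queued right away. The sketch below shows one common way a UDP tunnel might treat that condition as a dropped datagram instead of a fatal worker error. It is only an illustration and not necessarily how Centipede handles it; `SendOutcome`, `classify_send`, and `send_or_drop` are hypothetical names introduced here.

```rust
use std::io;
use std::net::{SocketAddr, UdpSocket};

/// Outcome of one best-effort send attempt (hypothetical type for this sketch).
#[derive(Debug, PartialEq, Eq)]
pub enum SendOutcome {
    /// The kernel accepted the datagram.
    Sent,
    /// The send would have blocked (EAGAIN / EWOULDBLOCK), so the datagram was dropped.
    Dropped,
}

/// Map the result of a send call to an outcome.
/// Only `WouldBlock` is treated as non-fatal; every other error is passed through.
pub fn classify_send(result: io::Result<usize>) -> io::Result<SendOutcome> {
    match result {
        Ok(_) => Ok(SendOutcome::Sent),
        Err(e) if e.kind() == io::ErrorKind::WouldBlock => Ok(SendOutcome::Dropped),
        Err(e) => Err(e),
    }
}

/// Send one datagram on a socket that is assumed to be in non-blocking mode.
/// Under back-pressure the datagram is dropped rather than failing the caller.
pub fn send_or_drop(
    socket: &UdpSocket,
    buf: &[u8],
    dest: SocketAddr,
) -> io::Result<SendOutcome> {
    classify_send(socket.send_to(buf, dest))
}
```

Dropping is a reasonable default for a tunnel carried over UDP, since the traffic inside it (TCP, or iperf3's own UDP accounting) already has to cope with loss. An alternative would be to wait for the socket to become writable again before retrying.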

max-niederman commented 3 months ago

Should be fixed in the latest commit. @kaowul would you mind testing this to verify?

kaowul commented 3 months ago

iperf3 -c 10.0.10.1
Connecting to host 10.0.10.1, port 5201
[ 5] local 10.0.10.2 port 52038 connected to 10.0.10.1 port 5201
[ ID] Interval Transfer Bitrate Retr Cwnd
[ 5] 0.00-1.00 sec 42.8 MBytes 359 Mbits/sec 42 177 KBytes
[ 5] 1.00-2.00 sec 41.6 MBytes 349 Mbits/sec 10 192 KBytes
[ 5] 2.00-3.00 sec 41.4 MBytes 348 Mbits/sec 14 205 KBytes

This is my most recent test. It confirms that the link speed between my two VPN peers is around 350 Mbits/sec. However, if I run iperf3 -c 10.0.10.1 -u -b 400M, the following error occurs after a few seconds:

./target/debug/centipede /root/peer2.toml
2024-04-25T06:58:43.580Z INFO centipede > spawned 2 worker threads
2024-04-25T06:58:43.742Z INFO centipede_control > received initiation acknowledgement from Gb1SnzVaODF8LHKoBD+dJtS/e/9GYEhnLX0sOk8OIbA=
× worker thread 0 failed
├─▶ failed to poll for events
╰─▶ Resource temporarily unavailable (os error 11)

2024-04-25T06:59:49.876Z INFO centipede > shutting down due to error
2024-04-25T06:59:49.885Z INFO centipede > received shutdown signal, waiting for workers to finish...

kaowul commented 3 months ago


This time when I ran iperf3 -c 10.0.10.2 -u -b 600m, an error occurred. It seems that this error occurs when the data sent exceeds the receiving capacity of the destination.
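If the receiving side really is being overrun, one pattern worth checking is how the receive loop reacts once a non-blocking source has nothing left to read. The sketch below drains a source until it reports `WouldBlock` and treats that as "nothing more for now" instead of as a failure. It is a generic illustration under stated assumptions, not Centipede's actual worker loop: `drain_ready` is a hypothetical helper, the 2048-byte buffer size is arbitrary, and the `recv` closure stands in for something like `UdpSocket::recv` on a non-blocking socket.

```rust
use std::io;

/// Drain a non-blocking source until it reports `WouldBlock`.
///
/// `recv` fills the buffer and returns the datagram length.
/// `handle` is called once per received datagram.
/// Returns the number of datagrams handled in this pass.
pub fn drain_ready<F, H>(mut recv: F, mut handle: H) -> io::Result<usize>
where
    F: FnMut(&mut [u8]) -> io::Result<usize>,
    H: FnMut(&[u8]),
{
    let mut buf = [0u8; 2048]; // arbitrary size chosen for this sketch
    let mut count = 0;
    loop {
        match recv(&mut buf) {
            Ok(n) => {
                handle(&buf[..n]);
                count += 1;
            }
            // Nothing left to read right now: not an error, just stop draining.
            Err(e) if e.kind() == io::ErrorKind::WouldBlock => return Ok(count),
            // Interrupted by a signal: retry the read.
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            // Anything else is a real failure.
            Err(e) => return Err(e),
        }
    }
}
```

With a real socket the call might look like `drain_ready(|b| socket.recv(b), |pkt| process(pkt))`, where `socket` is a non-blocking `UdpSocket` and `process` is whatever the caller does with each packet (both hypothetical here).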