I'd like to see stream frames as early as packet 1893. What should I do? Thank you.
If I disable the anti-amplification check, then the server sends stream frames much sooner. Since our client only sends ACKs (which results in a low amount of data), the only way to avoid being blocked by the anti-amplification limit is to pre-validate the path. So we need to implement #1772 after all.
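For context, a minimal illustration (not quinn's code) of the RFC 9000 anti-amplification rule doing the blocking here: until the path is validated, the server may send at most three times the number of bytes it has received on that path, so a client that only sends small ACKs gives the server very little budget for stream data.

```rust
// Sketch of the 3x anti-amplification budget check; values are made up.
fn may_send(path_validated: bool, bytes_received: u64, bytes_sent: u64, datagram_len: u64) -> bool {
    path_validated || bytes_sent + datagram_len <= 3 * bytes_received
}

fn main() {
    // e.g. the client has sent a single 85-byte ACK packet on the new path:
    assert!(may_send(false, 85, 0, 255));      // still within the 3 * 85 budget
    assert!(!may_send(false, 85, 200, 85));    // further full packets are blocked
    assert!(may_send(true, 85, 10_000, 1200)); // validated path: no limit
}
```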
This is the packet log without the anti-amplification check:
1838 6.468867 10.0.2.1 10.0.3.1 78 DCID=12dbff08b31a8580, PKN: 606, PING, RC, PADDING
1839 6.469667 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 607, ACK_ECN
1840 6.481102 10.0.3.1 10.0.1.1 1033 DCID=0855e0b1943a5f40, PKN: 1245, ACK_ECN, STREAM(3)
1850 6.489553 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 608, ACK_ECN
1851 6.489968 10.0.3.1 10.0.1.1 270 DCID=0855e0b1943a5f40, PKN: 1255, STREAM(3)
1857 6.492712 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 609, ACK_ECN
1858 6.493145 10.0.3.1 10.0.1.1 270 DCID=0855e0b1943a5f40, PKN: 1261, STREAM(3)
1867 6.498707 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 610, ACK_ECN
1868 6.499395 10.0.3.1 10.0.1.1 270 DCID=0855e0b1943a5f40, PKN: 1270, STREAM(3)
1873 6.501991 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 611, ACK_ECN
1874 6.502030 10.0.3.1 10.0.1.1 458 DCID=0855e0b1943a5f40, PKN: 1275, STREAM(3)
1875 6.503009 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 612, ACK_ECN
1876 6.504135 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 613, ACK_ECN
1877 6.505150 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 614, ACK_ECN
1878 6.506714 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 615, ACK_ECN
1879 6.507650 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 616, ACK_ECN
1880 6.509155 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 617, ACK_ECN
1881 6.511331 10.0.2.1 10.0.3.1 85 DCID=12dbff08b31a8580, PKN: 618, ACK_ECN
1882 6.518992 10.0.3.1 10.0.1.1 1248 DCID=696eecfba4219c90, PKN: 1276, PC, PADDING
1883 6.518997 10.0.3.1 10.0.2.1 1248 DCID=696eecfba4219c90, PKN: 1277, PC, RC, PADDING
1884 6.520776 10.0.3.1 10.0.2.1 1374 DCID=696eecfba4219c90, PKN: 1278, PING, PADDING
1885 6.523512 10.0.3.1 10.0.2.1 1248 DCID=696eecfba4219c90, PKN: 1279, PC, NCI, PADDING
1886 6.525998 10.0.3.1 10.0.2.1 1248 DCID=696eecfba4219c90, PKN: 1280, ACK_ECN, PC, STREAM(3)
the diff will be a bit more readable if you set #1800's branch as the target,
I didn't know that was possible. Thanks.
or just rebase onto main if the changes don't actually interact.
I'll probably go this way.
I'm wondering if it might be cleaner to use a single unchanging socket bound to a wildcard IP, and have each outgoing datagram set its source IP address explicitly. This would avoid the need for two separate receive paths, and allow connections to operate concurrently on multiple addresses on the same endpoint. It would break support for active migration on platforms that don't support `IPV6_PKTINFO` or equivalent, but most platforms do.
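For illustration, a rough sketch of what "set the source IP per outgoing datagram" could look like. It assumes the `nix` and `libc` crates on a Linux-like platform and is not quinn's actual send path.

```rust
// Send a datagram from a wildcard-bound socket while pinning the source
// address explicitly via an IPV6_PKTINFO control message.
use std::net::{Ipv6Addr, SocketAddrV6};
use std::os::fd::AsRawFd;

use nix::sys::socket::{sendmsg, ControlMessage, MsgFlags, SockaddrIn6};

fn send_from(
    socket: &std::net::UdpSocket, // bound to [::]:port
    payload: &[u8],
    src: Ipv6Addr,                // source address chosen per datagram
    dst: SocketAddrV6,
) -> nix::Result<usize> {
    // IPV6_PKTINFO ancillary data: which local address the kernel should
    // use as the packet's source. ipi6_ifindex stays 0 (kernel picks the
    // interface).
    let mut pktinfo: libc::in6_pktinfo = unsafe { std::mem::zeroed() };
    pktinfo.ipi6_addr.s6_addr = src.octets();

    let iov = [std::io::IoSlice::new(payload)];
    let cmsg = [ControlMessage::Ipv6PacketInfo(&pktinfo)];
    sendmsg(
        socket.as_raw_fd(),
        &iov,
        &cmsg,
        MsgFlags::empty(),
        Some(&SockaddrIn6::from(dst)),
    )
}
```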
I'm OK with this.
But does it eliminate use cases where the client does not want to receive traffic on a specific network interface for security reasons? Hm. Maybe the client could filter incoming packets based on their destination address.
And what if the client wants to move from port A to port B? We used to test migration by moving from 127.0.0.1:A to 127.0.0.1:B, although this might not be relevant in practice.
This would probably be a larger change, so perhaps best left for future work even if it sounds good.
If there are multiple sockets to listen to, one can use the `select` libc call. It seems Rust's `selecting` crate only supports `AsRawFd`, which is not available on Windows. `libselect` is cross-platform, but I'm not sure it is acceptable to use it in quinn. Or is there any other way currently in quinn to poll two sockets at once?
At any rate, maybe it is enough for this pull request to query the two sockets in sequence.
But does it eliminate use cases where the client does not want to receive traffic on a specific network interface for security reasons?
Good question. This can be solved at the host firewall level, but people are accustomed to specifying a narrow listen address and automatically getting strong isolation without any extra effort, so this could be a footgun. Packets from unexpected peers do get dropped quickly, but this would arguably remove a layer of defense-in-depth.
And what if the client wants to move from port A to port B?
Another good question. It's not obvious to me whether there's a use case here, but maybe if there are weird firewall rules, or the application is actively trying to break linkability? We're getting this at the cost of the capability to have different connections on different paths in the same endpoint, but an application can always choose to have multiple endpoints.
It sounds like sticking with multiple sockets along the lines drafted here is the safer bet, even if it is a little ugly. We may want to revisit this for multipath in the future when concurrent outgoing paths are more important.
At any rate, maybe it is enough for this pull request to query the two sockets in sequence.
Correct. The underlying async runtime is responsible for notifying us when any operation that we got a `Poll::Pending` (morally `EWOULDBLOCK`) for on the previous `Future::poll` becomes newly ready.
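A minimal sketch of why querying the sockets in sequence is sufficient, using `tokio::net::UdpSocket` for concreteness rather than quinn's own socket abstraction: every `poll_recv_from()` that returns `Poll::Pending` registers the task's waker, so the task is re-polled as soon as either socket becomes readable.

```rust
// Sketch, not quinn's actual code: poll a "new" socket and an optional
// "abandoned" socket in sequence within a single poll function.
use std::io;
use std::net::SocketAddr;
use std::task::{Context, Poll};

use tokio::io::ReadBuf;
use tokio::net::UdpSocket;

fn poll_either(
    cx: &mut Context<'_>,
    new_socket: &UdpSocket,
    abandoned_socket: Option<&UdpSocket>,
    buf: &mut ReadBuf<'_>,
) -> Poll<io::Result<SocketAddr>> {
    // Try the current socket first.
    if let Poll::Ready(res) = new_socket.poll_recv_from(cx, buf) {
        return Poll::Ready(res);
    }
    // Then the old socket, if we are still holding on to it.
    if let Some(old) = abandoned_socket {
        if let Poll::Ready(res) = old.poll_recv_from(cx, buf) {
            return Poll::Ready(res);
        }
    }
    // Both returned Pending and both registered this task's waker in doing
    // so, so the runtime polls us again as soon as either socket becomes
    // readable. No select(2)-style multiplexing is needed.
    Poll::Pending
}
```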
I force-pushed a new version, which eliminates the code duplication of the previous version as requested. Due to my limited Rust knowledge I struggled with the refactoring, and the result is probably not idiomatic Rust.
Just like the upstream main version, it does not call `self.recv_limiter.finish_cycle()` in case of an error. Maybe there's a good reason for that.
Still needs a rebase onto current main. While you're at it, can you change the order of the methods to have `drive_recv()` before `poll_socket()`? I seem to recall it was like that at some point, but maybe that got lost in the shuffle.
Still needs a rebase onto current main.
I've just done it. However, I just wasn't able to move `iovs: [IoSliceMut; BATCH_SIZE]` out of `poll_socket()` up to `drive_recv()`, or even to `struct State`. Is it theoretically possible to reuse `iovs`?
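For what it's worth, a small standalone sketch (assumed buffer sizes and names, unrelated to quinn's actual types) of why `IoSliceMut` resists being stored long-term: it borrows the buffers it points into, so a struct owning both would be self-referential, which safe Rust cannot express; keeping the buffers and the iovecs together one frame up does work.

```rust
use std::io::IoSliceMut;

const BATCH_SIZE: usize = 32;

fn drive_recv_sketch() {
    // Buffers and the iovecs that borrow them can share one stack frame...
    let mut bufs = [[0u8; 1500]; BATCH_SIZE];
    let mut iovs: Vec<IoSliceMut<'_>> =
        bufs.iter_mut().map(|b| IoSliceMut::new(&mut b[..])).collect();
    poll_socket_sketch(&mut iovs);
}

fn poll_socket_sketch(_iovs: &mut [IoSliceMut<'_>]) {
    // ...and can then be reborrowed by the inner call on every poll.
}

// struct State {
//     bufs: [[u8; 1500]; BATCH_SIZE],
//     iovs: [IoSliceMut<'???>; BATCH_SIZE], // would have to borrow `bufs`
//                                           // above: a self-referential
//                                           // struct, not expressible safely
// }

fn main() {
    drive_recv_sketch();
}
```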
While you're at it, can you change the order of the methods to have `drive_recv()` before `poll_socket()`? I seem to recall it was like that at some point, but maybe that got lost in the shuffle.
Sorry, I don't know what happened; I seem to have pushed an old version.
What if the endpoint has multiple outgoing connections to multiple servers? Should we hold on to `abandoned_socket` until we've authenticated a post-migration packet from each live connection?
Unfortunately I haven't thought about multiple connections. If a connection is successfully migrated to the new socket, I think the client should not accept packets from the old socket for that connection to avoid potential malicious activity. So handling multiple outgoing connections will be a bit complex.
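For reference, a rough standalone sketch of the single-connection handoff being discussed. Field and method names are assumptions, not the PR's actual code, and both sockets are assumed to be set non-blocking: the old socket is drained only until the first datagram shows up on the new one.

```rust
use std::net::UdpSocket;

struct MigratingEndpoint {
    socket: UdpSocket,                   // the new, post-migration socket
    abandoned_socket: Option<UdpSocket>, // the old socket, kept temporarily
}

impl MigratingEndpoint {
    fn recv(&mut self, buf: &mut [u8]) -> std::io::Result<usize> {
        // Any datagram on the new socket means the new path is in use;
        // stop listening on the old one to avoid accepting stale traffic.
        match self.socket.recv(buf) {
            Ok(n) => {
                self.abandoned_socket = None;
                return Ok(n);
            }
            Err(e) if e.kind() != std::io::ErrorKind::WouldBlock => return Err(e),
            _ => {}
        }
        // Nothing on the new socket yet; fall back to the old one if held.
        match &self.abandoned_socket {
            Some(old) => old.recv(buf),
            None => Err(std::io::ErrorKind::WouldBlock.into()),
        }
    }
}
```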
Instead of force-pushing a new version, I pushed separate commits to address your review points. I think it is easier to understand them this way. I can merge into one commit in the end if necessary.
As an additional improvement, it would be good not to create `iovs` with each `poll_socket()` call, but I was not able to factor this out into `PollState::new()`.
I think rewriting history as follows, with roughly the same end state, would provide the greatest clarity to support review for correctness:
1. Introduce `PollState`, moving `drive_recv` onto it (plus a socket argument) without changing behavior
2. Reintroduce a method on `State` and move the logic to start/end the `recv_limiter` cycle into it
3. Introduce `abandoned_socket` and associated logic
If that's beyond what you're comfortable with/motivated for, let me know and I can have a go at it myself.
These steps sound logical, but I just cannot implement step 1 without step 2. So it would be great if you could take it over. Thanks.
Split up the changes as described, and addressed my own outstanding feedback. I still need to do a final once-over for correctness, though.
Did a little more readability cleanup and satisfied myself as far as correctness goes.
During a planned connection migration, allow the client to receive traffic via the old, abandoned socket until the first packet arrives on the new socket.
I wrote this PR in a "cargo cult" style without really knowing what I was doing. It seems to work, as the client starts to acknowledge stream frames of the old path on the new path. However, the server does not send stream frames right after it notices the new path, even if I modify the server to process acknowledgements when there's an outstanding path challenge here.
Packet No. 1828, below, is the first packet arriving at the server on the new path. The server sends path challenges in 1891 (allowing linkability) and 1893. The path validation for the new path arrives in 1910. The server sends the first stream frame on the new path in 1913, with a repeated path challenge, so I assume it hasn't processed the path response frame yet.
I'd like to see stream frames as early as packet 1893. What should I do? Thank you.
This is the server-side packet trace: