@njgheorghita this is the fix for the 1MiB transfer failures you were seeing in the "gossip on delete" prototype.
Problem: transfers with a single write larger than 1 MiB (1024^2 bytes) were hanging, then failing with a timeout.
Basic approach:
First, I refactored the existing concurrency test so that we can pass in custom data values (like a single larger value).
Second, I added a test that sends a large value. I could trigger the failure with 1024^2 + 1 bytes, while one fewer byte passes the test. It seemed worth testing a much larger amount, so I went up to 50 MiB successfully. I ran into issues at 100 MiB that might be related to a rollover at 2^16 packets, but punted on that for now.
Third, I fixed the bug (and got the new test passing) by splitting up pending writes when they don't fit in the send buffer. This also changes how smaller pending writes behave, because we now always fill the send buffer completely. In practice we usually do a single write and then close, so I don't think it will come up (and I don't foresee a problem with the new behavior anyhow).
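For reviewers who want the idea without reading the diff, here is a minimal sketch of the splitting approach described above. It is not the code in this PR: `SendBuffer`, `fill_from_pending`, and `send_all` are hypothetical names made up for illustration, and draining the buffer stands in for data being sent and ACKed.

```rust
use std::collections::VecDeque;

/// Hypothetical fixed-capacity send buffer (illustration only).
struct SendBuffer {
    capacity: usize,
    data: Vec<u8>,
}

impl SendBuffer {
    fn new(capacity: usize) -> Self {
        Self { capacity, data: Vec::new() }
    }

    /// Bytes that can still be written before the buffer is full.
    fn available(&self) -> usize {
        self.capacity - self.data.len()
    }

    /// Stand-in for the buffer emptying as packets are sent and ACKed.
    fn drain(&mut self) -> Vec<u8> {
        std::mem::take(&mut self.data)
    }
}

/// Moves pending bytes into the buffer until it is full or nothing is
/// pending. A write that does not fit is split: the part that fits is
/// buffered and the remainder goes back to the front of the queue.
/// Returns the number of bytes moved.
fn fill_from_pending(buf: &mut SendBuffer, pending: &mut VecDeque<Vec<u8>>) -> usize {
    let mut moved = 0;
    while buf.available() > 0 {
        let mut write = match pending.pop_front() {
            Some(w) => w,
            None => break,
        };
        let n = write.len().min(buf.available());
        // `split_off(n)` keeps the first n bytes in `write` and returns the rest.
        let rest = write.split_off(n);
        buf.data.extend_from_slice(&write);
        moved += n;
        if !rest.is_empty() {
            pending.push_front(rest);
        }
    }
    moved
}

/// Pushes one write through a buffer of the given capacity, draining the
/// buffer after each fill. Returns the total number of bytes "sent".
fn send_all(capacity: usize, data: Vec<u8>) -> usize {
    let mut buf = SendBuffer::new(capacity);
    let mut pending = VecDeque::from([data]);
    let mut sent = 0;
    while !pending.is_empty() {
        fill_from_pending(&mut buf, &mut pending);
        sent += buf.drain().len();
    }
    sent
}
```

With a 1024^2-byte buffer, `send_all(1024 * 1024, vec![0u8; 1024 * 1024 + 1])` takes two fills (one full buffer, then the single leftover byte) rather than waiting forever for the whole write to fit. Because the loop keeps pulling from the queue until the buffer is full, several small pending writes also get packed into one fill, which is the behavior change mentioned above.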
Unfortunately, there does appear to be some kind of performance hit when the transfer is much larger than the buffer. Throughput drops from ~600 Mbps at 1 MB to under 20 Mbps at 52 MB across these 5 tests:
2024-08-02T01:45:20.923482Z INFO socket: finished udp load test with 1 transfer of 1 MB, in 14.0ms, at a rate of 597.4 Mbps
2024-08-02T01:46:02.074239Z INFO socket: finished udp load test with 1 transfer of 2 MB, in 62.9ms, at a rate of 266.6 Mbps
2024-08-02T01:55:20.269157Z INFO socket: finished udp load test with 1 transfer of 4 MB, in 582.3ms, at a rate of 57.6 Mbps
2024-08-02T02:44:33.315410Z INFO socket: finished udp load test with 1 transfer of 34 MB, in 6.5s, at a rate of 41.3 Mbps
2024-08-02T02:45:45.387391Z INFO socket: finished udp load test with 1 transfer of 52 MB, in 22.6s, at a rate of 18.5 Mbps
I haven't had a chance to hunt down the cause. At the moment, I think I will just move on. We really need to get this bugfix out now, and I think there are more urgent things that need to be resolved before Devcon.
Note that concurrent transfers can still saturate UDP, at ~1Gbps:
2024-08-02T04:53:51.354249Z INFO socket: finished high concurrency load test of 1000 simultaneous transfers, in 7.72s, at a rate of 1036 Mbps