ZeroTier 1.14.1 adds multi-threading support. Once it is enabled and configured, throughput in the same environment improves from 1.5 Gbits/sec to 3 Gbits/sec, or 5 Gbits/sec unencrypted 🎉
Test environment: PVE (Proxmox VE); Intel Xeon Gold 6330; Linux x64
Client: 4c4t, 8G; ZeroTier 1.14.1
Server: 16c16t, 16G; ZeroTier 1.14.1
Client and server are connected directly via a PVE vmbr bridge, which is not a bottleneck: a direct iperf3 test over it reaches 30 Gbits/sec.
Test method: iperf3 with 8 or more parallel TCP streams, each run lasting 30 s.
With multi-threading disabled on both client and server, performance is around 1.5 Gbits/sec.
With multi-threading enabled on both sides, bandwidth reaches 3 Gbits/sec; optionally configuring "trustedPathId" to disable encryption raises it to 5 Gbits/sec.
In this test scenario the decryption thread is fully loaded, so throughput cannot scale any further.
[Figure: flame graphs of the decryption thread]
A straightforward idea is to add more decryption threads to resolve this bottleneck.
Conclusion
The tests confirm that multi-threading in 1.14.1 gives ZeroTier a 2x performance improvement (1.5 to 3 Gbits/sec with encryption). Thanks to the developers for their work.
This requires Linux, with multi-threading enabled on both client and server.
Based on preliminary analysis, the bottleneck in 1.14.1's multi-threading appears to be the single decryption thread. Perhaps adding more decryption threads could be a direct improvement.
With multi-threading enabled and "trustedPathId" configured for no encryption (optional):
Client configuration:
Server configuration:
Bandwidth reaches 3 Gbits/sec with encryption and 5 Gbits/sec unencrypted 🎉
1.14.1 Performance Bottleneck
This bottleneck analysis may contain errors; please point out any inaccuracies. Thank you very much.
int bucket = flowId % _concurrency;
always results in only one thread executing in this test scenario.
A straightforward idea is to add more decryption threads to resolve this bottleneck.
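To illustrate how a modulo on the flow ID can leave a single thread doing all the work, here is a minimal C++ sketch. It is not ZeroTier code: `countPacketsPerBucket` and `busyBuckets` are hypothetical helpers, `concurrency` stands in for `_concurrency`, and the idea that every packet in this test carries the same `flowId` is only an assumption that would explain the observed behaviour.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of ZeroTier): applies the quoted bucket
// selection, flowId % concurrency, to a list of per-packet flow IDs and
// counts how many packets each bucket (worker thread) would receive.
std::vector<int> countPacketsPerBucket(const std::vector<std::uint64_t>& flowIds,
                                       unsigned concurrency) {
    assert(concurrency > 0);  // a modulo by zero would be undefined
    std::vector<int> counts(concurrency, 0);
    for (std::uint64_t flowId : flowIds) {
        unsigned bucket = static_cast<unsigned>(flowId % concurrency);
        ++counts[bucket];
    }
    return counts;
}

// Hypothetical helper: number of buckets that received at least one packet.
int busyBuckets(const std::vector<int>& counts) {
    int busy = 0;
    for (int c : counts) {
        if (c > 0) {
            ++busy;
        }
    }
    return busy;
}
```

With a concurrency of 4 and eight packets that all carry `flowId` 42, every packet lands in bucket 2 and only one bucket is busy. Eight packets with flow IDs 0 through 7 spread evenly over all four buckets. If the real flow IDs behave like the first case, raising the thread count alone would not help unless the bucket selection also changes.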