nanomsg / nng

nanomsg-next-generation -- light-weight brokerless messaging
https://nng.nanomsg.org
MIT License

Websocket masking could be a lot faster #1801

Open gdamore opened 3 months ago

gdamore commented 3 months ago

The current masking code is fairly naive and masks a byte at a time. But pretty much everyone has 64-bit operations (and some even have 128-bit SIMD operations).
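To make the idea concrete, here is a minimal sketch of byte-at-a-time masking next to a 64-bit variant. The function names `mask_bytewise` and `mask_wordwise` are hypothetical and are not taken from the nng source; the masking rule itself (XOR each payload byte with `key[i % 4]`) is the one from RFC 6455. The word loop goes through `memcpy` so that it makes no assumption about buffer alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Reference version: one XOR per payload byte (hypothetical name).
static void
mask_bytewise(uint8_t *buf, size_t len, const uint8_t key[4])
{
	for (size_t i = 0; i < len; i++) {
		buf[i] ^= key[i % 4];
	}
}

// 64-bit version: XOR eight payload bytes per iteration (hypothetical name).
static void
mask_wordwise(uint8_t *buf, size_t len, const uint8_t key[4])
{
	uint64_t key64;
	size_t   i = 0;

	// Repeat the 4-byte key twice in memory order. Because both the key
	// and the data are moved with memcpy, this is endian-neutral.
	memcpy(&key64, key, 4);
	memcpy((uint8_t *) &key64 + 4, key, 4);

	for (; i + 8 <= len; i += 8) {
		uint64_t w;
		memcpy(&w, buf + i, 8); // safe for any alignment
		w ^= key64;
		memcpy(buf + i, &w, 8);
	}

	// Tail: i is a multiple of 8 here, so key[i % 4] stays in phase.
	for (; i < len; i++) {
		buf[i] ^= key[i % 4];
	}
}
```

Masking is its own inverse, so applying either function twice with the same key restores the original payload. Whether the `memcpy` form actually beats the byte loop on a given compiler and target would need to be measured.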

It's possible that optimizers are good enough to mask this efficiently already, but from what I observe on godbolt, even under -O3 both clang and gcc don't optimize this well - at least for x86.

The interesting concern here will be dealing with misaligned data, which is not an issue on x86 and is usually not an issue on modern ARM (aarch64).
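For targets where misaligned access does matter, one possible approach is to mask the leading bytes one at a time until the pointer is 8-byte aligned, then rotate the key to match the new phase before entering the word loop. The sketch below assumes that approach; `mask_aligned` is a hypothetical name, not an nng function. It still loads and stores through `memcpy` to stay clear of strict-aliasing problems, on the assumption that the compiler turns those copies into plain word accesses.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

// Mask buf in place, aligning to 8 bytes before the word loop
// (hypothetical helper for illustration).
static void
mask_aligned(uint8_t *buf, size_t len, const uint8_t key[4])
{
	size_t   i = 0;
	uint8_t  rot[8];
	uint64_t key64;

	// Head: byte-wise until buf + i sits on an 8-byte boundary.
	while (i < len && ((uintptr_t) (buf + i) % 8) != 0) {
		buf[i] ^= key[i % 4];
		i++;
	}

	// The head may have consumed a number of bytes that is not a
	// multiple of 4, so build the 8-byte key starting at phase i % 4.
	for (size_t j = 0; j < 8; j++) {
		rot[j] = key[(i + j) % 4];
	}
	memcpy(&key64, rot, 8);

	// Body: i advances by 8, so the key phase does not change.
	for (; i + 8 <= len; i += 8) {
		uint64_t w;
		memcpy(&w, buf + i, 8);
		w ^= key64;
		memcpy(buf + i, &w, 8);
	}

	// Tail: remaining 0..7 bytes.
	for (; i < len; i++) {
		buf[i] ^= key[i % 4];
	}
}
```

The same head/body/tail split would carry over to a 128-bit SIMD body, with the key repeated four times instead of twice.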

Fixing this would be a substantial improvement for high-bandwidth WebSocket messages.