microsoft / ntttcp-for-linux

A Linux network throughput multiple-thread benchmark tool.
https://github.com/Microsoft/ntttcp-for-linux
MIT License

Remove structure slops to reduce cache pressure #42

Open lpereira opened 4 years ago

lpereira commented 4 years ago

Many shared structs have their members ordered in a way that, when built on an LP64 system, leaves unused bytes due to alignment. These structs could be shrunk by carefully reordering their members, and perhaps by changing the types of certain members to something that packs better.
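A minimal sketch of the kind of padding described above. Both structs and all member names are hypothetical and are not taken from ntttcp; the byte counts in the comments assume a typical LP64 ABI such as x86-64 Linux.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct (not from ntttcp): small members interleaved with
 * larger ones, so the compiler must insert padding to align them. */
struct padded_example {
    bool     is_active;  /* 1 byte, then 7 bytes of padding (LP64) */
    void    *buffer;     /* 8 bytes on LP64 */
    bool     verbose;    /* 1 byte, then 3 bytes of padding */
    uint32_t count;      /* 4 bytes */
    bool     done;       /* 1 byte, then 7 bytes of tail padding */
};                       /* 32 bytes on typical LP64 */

/* Same members, ordered from largest alignment to smallest. */
struct reordered_example {
    void    *buffer;     /* 8 bytes on LP64 */
    uint32_t count;      /* 4 bytes */
    bool     is_active;  /* 1 byte */
    bool     verbose;    /* 1 byte */
    bool     done;       /* 1 byte, then 1 byte of tail padding */
};                       /* 16 bytes on typical LP64 */

/* Reordering alone removes the interior padding. */
static_assert(sizeof(struct reordered_example) < sizeof(struct padded_example),
              "reordered layout should be smaller");
```

Running `pahole` on an object file built with debug info reports these holes for the real structs, so the comments above do not need to be maintained by hand.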

For instance, struct ntttcp_stream_server has a sizeof of 144, spanning 3 cache lines. According to pahole, it could be as small as 129, which is 5 bytes over using just 2 cache lines. Changing all the bool members to a single integer holding flags would make it fit in 2 cache lines.
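The bool-to-flags change could look roughly like the sketch below. The flag names, struct names, and helper functions are all hypothetical; they illustrate the technique and are not the actual members of struct ntttcp_stream_server.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag bits, one per former bool member. */
enum stream_flags {
    STREAM_FLAG_VERBOSE    = 1u << 0,
    STREAM_FLAG_NO_SYNC    = 1u << 1,
    STREAM_FLAG_CONTINUOUS = 1u << 2,
    STREAM_FLAG_USE_EPOLL  = 1u << 3,
    STREAM_FLAG_IS_SYNC    = 1u << 4,
};

/* Before: five separate bool members (hypothetical layout). */
struct stream_with_bools {
    uint64_t total_bytes;
    uint32_t id;
    bool     verbose, no_sync, continuous, use_epoll, is_sync;
};

/* After: the same state in a single byte of flags. */
struct stream_with_flags {
    uint64_t total_bytes;
    uint32_t id;
    uint8_t  flags;      /* bitwise OR of enum stream_flags values */
};

/* Set or clear one flag bit. */
static inline void stream_set_flag(struct stream_with_flags *s,
                                   uint8_t flag, bool on)
{
    if (on)
        s->flags |= flag;
    else
        s->flags &= (uint8_t)~flag;
}

/* Test one flag bit. */
static inline bool stream_has_flag(const struct stream_with_flags *s,
                                   uint8_t flag)
{
    return (s->flags & flag) != 0;
}

static_assert(sizeof(struct stream_with_flags) < sizeof(struct stream_with_bools),
              "flags layout should be smaller");
```

The trade-off is that every read of a former bool becomes a mask test, so call sites have to change from `s->verbose` to something like `stream_has_flag(s, STREAM_FLAG_VERBOSE)`.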

simonxiaoss commented 4 years ago

Yes, good catch. The alignment is really something we should fix.