In the current Tcp::send(), we always run the following code to limit the number of write system calls:
    auto sWithLength = oStream.str();
    sWithLength.append(msg);
    boost::asio::write(*rank2sock[r], boost::asio::buffer(sWithLength.data(), sWithLength.length()));
This is suboptimal when msg is large (over 10,000 bytes?), because sWithLength.append(msg); allocates a new large chunk of memory and copies msg into sWithLength. So we propose to issue two write system calls instead, one for the content of oStream and one for msg, which avoids both the allocation and the copy.
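To make the trade-off concrete, here is a minimal sketch of the two strategies. It is an illustration under assumptions, not the actual FBAE code: it uses plain POSIX write() on a file descriptor instead of Boost.Asio so that it stays self-contained, and the names writeAll, sendOneWrite and sendTwoWrites are hypothetical. In Tcp::send() the two-write variant would presumably become two boost::asio::write calls on *rank2sock[r], the first on the string built from oStream and the second on msg.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>
#include <sys/socket.h>
#include <unistd.h>

// Hypothetical helper: write every byte of `data` to `fd`,
// retrying on short writes and on EINTR.
void writeAll(int fd, std::string_view data) {
    assert(fd >= 0);
    while (!data.empty()) {
        ssize_t n = ::write(fd, data.data(), data.size());
        if (n < 0) {
            if (errno == EINTR) continue;
            throw std::runtime_error("write failed");
        }
        data.remove_prefix(static_cast<std::size_t>(n));
    }
}

// Current strategy: append msg to the header, then write once.
// The append allocates a buffer of header.size() + msg.size() bytes
// and copies msg into it.
void sendOneWrite(int fd, const std::string& header, std::string_view msg) {
    std::string withLength = header;
    withLength.append(msg);
    writeAll(fd, withLength);
}

// Alternative strategy: write the header, then write msg directly
// from its own buffer. One more system call, but no allocation or copy.
void sendTwoWrites(int fd, std::string_view header, std::string_view msg) {
    writeAll(fd, header);
    writeAll(fd, msg);
}
```

Both functions put the same bytes on the stream, so the receiver does not need to change; only the number of write system calls and the extra copy differ.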
Note: The performance difference was observed in an experiment with 2 participants on a 1 Gbps network (python3 launch_fbae.py /netfs/inf/simatic/FBAE/ /netfs/inf/simatic/FBAE/results/ -a B -c t -f 8000 -n 40000 -s 8192 -S /netfs/inf/simatic/FBAE/sites_2_b313.json -w 10 -m 1000000):
With 1 write system call, throughput was 1106 Mbps.
With 2 write system calls, throughput was 1178 Mbps, i.e. an improvement of +6.5%.