We're seeing a latency of up to 2.5ms between the send and receive time for a C++ publisher and Java subscriber on loopback. How can we decrease the latency? The implementation details follow.
The structure definition (Packet.lcm):
struct Packet {
    int32_t id;
}
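If it would help to measure true one-way latency directly instead of inferring it from inter-arrival times, one option is to carry the send time inside the message itself. A hypothetical extension of the type (the send_time_ns field name is ours, not part of the original definition):

```
struct Packet {
    int32_t id;
    int64_t send_time_ns;  // sender's clock at publish time, in nanoseconds (hypothetical field)
}
```

The subscriber can then subtract this from its own clock on arrival; on a single host both processes share the system clock, so the difference is meaningful.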
The C++ publisher code (PacketPublisher.cpp), where rate is 7500 microseconds (7.5 ms):
#include <chrono>
#include <thread>

bool PacketPublisher::init() {
    return mMessageBus.good();
}

void PacketPublisher::run( uint32_t rate ) {
    // Publish packets at a given rate (microseconds between publishes).
    while( true ) {
        mMessageBus.publish( "Phy_Rx_Packet", &mPacket );
        // Uniquely identify the packet.
        mPacket.id++;
        // Wait for the given publish interval to expire.
        std::this_thread::sleep_for( std::chrono::microseconds( rate ) );
    }
}
The Java subscriber code (PacketSubscriber.java):
import java.io.IOException;
import java.time.Duration;
import java.time.Instant;
import static java.time.Instant.now;

import lcm.lcm.LCM;
import lcm.lcm.LCMDataInputStream;
import lcm.lcm.LCMSubscriber;

class PacketSubscriber implements LCMSubscriber {
    private static final String CHANNEL_ID = "Phy_Rx_Packet";
    private double mAverage = 0;
    private long mPacketCount;
    private Instant mThen = now();

    PacketSubscriber() throws IOException {
        var messageBus = new LCM();
        messageBus.subscribe(CHANNEL_ID, this);
    }

    @Override
    public void messageReceived(LCM lcm, String channel, LCMDataInputStream in) {
        if (CHANNEL_ID.equals(channel)) {
            try {
                var packet = new Packet(in);
                var delta = Duration.between(mThen, now());
                // Exponentially weighted running average of the inter-arrival time.
                mAverage = (delta.toNanos() + mAverage) / 2;
                if (mPacketCount == 0) {
                    System.out.printf("Average arrival (ms) %f [id: %d]%n",
                        (mAverage / 1_000_000), packet.id);
                }
                // 133 * 7.5 ms ~= 1 second
                mPacketCount = (mPacketCount + 1) % 133;
                mThen = now();
            } catch (IOException e) {
                throw new RuntimeException(e);
            }
        }
    }
}
The timing for the first result in the following sample output is expected because we invoked the publisher and subscriber by hand rather than from a script; the 10.05 could be considered noise from warm-up, so it's also not a concern. What is a concern is that, by and large, these simple 32-bit packets are received by Java at intervals of around 8.5 ms per packet, about 1 ms more than the 7.5 ms publish interval, which implies ~1 ms of one-way network latency on localhost. We expected the subscriber to receive them with perhaps 0.25 ms of latency on average.
Average arrival (ms) 1779.570000 [id: 0]
Average arrival (ms) 8.254820 [id: 133]
Average arrival (ms) 9.793491 [id: 266]
Average arrival (ms) 10.056988 [id: 399]
Average arrival (ms) 8.892480 [id: 532]
Average arrival (ms) 8.622321 [id: 665]
Average arrival (ms) 8.086657 [id: 798]
Average arrival (ms) 7.825660 [id: 3591]
Average arrival (ms) 9.998436 [id: 4123]
Average arrival (ms) 9.668022 [id: 15029]
We ran async-profiler on the code to see if there are any obvious bottlenecks. A lot of time is spent in the kernel ([k]), finishing up a task switch, which is probably outside the control of LCM.
The kernel task switch has the following call stack:
The environment is set up as follows:
What would you advise tweaking to improve performance? (Network settings, garbage collector, compiler options, CPU affinity, etc.)