Open cBournhonesque opened 3 weeks ago
Did you actually run their benchmarks or did you just trust their results? :p
Ran their benchmarks :) I'm trying to understand why the difference can be so big. Probably because of extra allocations? but still
First optimization I will try:
EntityActionsMessage directly
Entity multiple times (once in Spawn, once in Removals, etc.)
Note that these new approaches also probably require more bandwidth, because bitcode was able to use bit-compression (write individual bits instead of bytes) for the previous EntityActionsMessage.
I will try b. first, since it should make the biggest difference in performance.
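To make the bandwidth point concrete, here is a rough sketch of bit-level vs byte-level writes. This is not how bitcode or lightyear actually encode anything: `BitWriter`, `encode_byte_aligned`, `encode_bit_packed` and the (flag, kind) entry layout are all made up for illustration, and the sketch assumes `kind` always fits in 3 bits.

```rust
/// Hypothetical minimal bit writer (illustration only, not bitcode's implementation).
pub struct BitWriter {
    bytes: Vec<u8>,
    bit_len: usize, // number of bits written so far
}

impl BitWriter {
    pub fn new() -> Self {
        Self { bytes: Vec::new(), bit_len: 0 }
    }

    /// Write the lowest `bits` bits of `value`, least-significant bit first.
    pub fn write_bits(&mut self, value: u32, bits: u32) {
        assert!(bits <= 32);
        for i in 0..bits {
            let byte_index = self.bit_len / 8;
            let shift = self.bit_len % 8;
            if byte_index == self.bytes.len() {
                self.bytes.push(0);
            }
            let bit = ((value >> i) & 1) as u8;
            self.bytes[byte_index] |= bit << shift;
            self.bit_len += 1;
        }
    }

    pub fn into_bytes(self) -> Vec<u8> {
        self.bytes
    }
}

/// Byte-aligned encoding: one full byte for the flag, one full byte for the kind.
pub fn encode_byte_aligned(entries: &[(bool, u8)]) -> Vec<u8> {
    let mut out = Vec::with_capacity(entries.len() * 2);
    for &(flag, kind) in entries {
        out.push(flag as u8);
        out.push(kind);
    }
    out
}

/// Bit-packed encoding: 1 bit for the flag, 3 bits for the kind (assumes kind < 8).
pub fn encode_bit_packed(entries: &[(bool, u8)]) -> Vec<u8> {
    let mut writer = BitWriter::new();
    for &(flag, kind) in entries {
        debug_assert!(kind < 8);
        writer.write_bits(flag as u32, 1);
        writer.write_bits(kind as u32, 3);
    }
    writer.into_bytes()
}
```

With this toy layout, 1000 entries take 2000 bytes byte-aligned and 500 bytes bit-packed, at the cost of a per-bit loop on the write path. That is the bandwidth vs CPU tradeoff in miniature.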
imo bandwidth is still the most important thing to optimize for multiplayer games. If it's impossible to optimize the serialization without increasing bandwidth, accepting slower serialization seems worth it. Ofc finding a balance is a good idea, but I feel like optimizing for bandwidth first makes more sense.
Agreed that bandwidth is more important, but this is still a massive difference: 40x! I would accept something like 5 or 10x (i.e. up to 300us to serialize the 1000 entities), but 1.3ms is way too much.
I think it's mostly due to 2 things:
Do you know how well lightyear compares to them in terms of avg bandwidth usage? Might be interesting to have benchmarks for that too.
Apart from that: 1) sounds like a massive pain to resolve and would probably require huge internal changes as far as I can tell, and 2) what is your definition of "easier" serialization?
Read/Write traits, and doesn't use bit-level stuff
As a simple first step, replicating everything as a single ReplicationGroup brings the time to around 780us, which is a huge improvement. Probably because we don't allocate new space in the hashmaps, and the vec allocations are more efficient.
Also, instead of sending lots of small messages we send one big message, which may be more efficient for channel internals.
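A rough sketch of why one big message can be cheaper than many small ones. This is not lightyear's code: `EntityData`, `serialize_per_entity` and `serialize_grouped` are hypothetical stand-ins, and the 8-byte layout is an assumption for illustration only.

```rust
/// Hypothetical stand-in for one entity's replicated data.
pub struct EntityData {
    pub id: u32,
    pub payload: [u8; 4],
}

/// One message per entity: every entity costs its own heap allocation.
pub fn serialize_per_entity(entities: &[EntityData]) -> Vec<Vec<u8>> {
    entities
        .iter()
        .map(|e| {
            let mut msg = Vec::with_capacity(8);
            msg.extend_from_slice(&e.id.to_le_bytes());
            msg.extend_from_slice(&e.payload);
            msg
        })
        .collect()
}

/// One message for the whole group, written into a caller-owned buffer
/// that keeps its capacity between calls.
pub fn serialize_grouped(entities: &[EntityData], buf: &mut Vec<u8>) {
    buf.clear(); // keeps the existing allocation
    buf.reserve(4 + entities.len() * 8);
    // Length prefix so a receiver would know how many entities follow.
    buf.extend_from_slice(&(entities.len() as u32).to_le_bytes());
    for e in entities {
        buf.extend_from_slice(&e.id.to_le_bytes());
        buf.extend_from_slice(&e.payload);
    }
}
```

For 1000 entities the per-entity version produces 1000 separate buffers (plus the outer vec), while the grouped version writes 8004 bytes into one buffer that can be reused on the next tick without reallocating.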
The full trace with all log spans is 1.9ms, with:
buffer_send_with_priority
If I remove the prepare_entity_spawn and prepare_component_insert tracing, I get 1ms with:
handle_replicating_add is 190us (system is 40us, commands are 150us)
ServerConnection::send()? Why is it taking so long?). Also, replicon doesn't seem to benchmark the renet packet building part.
So should we get rid of ReplicationGroups?
Benchmarks show that it takes 1.3 ms to replicate 1000 entities (replicon takes 30us). Why?
With a lot of tracing spans, it's 3ms (because of the tracing overhead):
send_entity_spawn takes 530us
send_component_update is 686us
networking::send is 1.13ms
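Since the spans themselves inflate the numbers, one way to cross-check a coarse phase is to time it with a plain `Instant` instead of a tracing span. This is a minimal sketch: `timed` is a hypothetical helper, not something that exists in lightyear or in the tracing crate.

```rust
use std::time::{Duration, Instant};

/// Hypothetical helper: run `f` once and return its result together with
/// the wall-clock time it took, without creating any tracing spans.
pub fn timed<T>(f: impl FnOnce() -> T) -> (T, Duration) {
    let start = Instant::now();
    let out = f();
    (out, start.elapsed())
}
```

Wrapping only the outermost call of a phase (for example the whole send step) gives a number that is not affected by per-entity span overhead, which can then be compared against the sum of the inner spans.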
Also here are the ChannelSendStats:
Potential ideas:
send_entity_spawn
networking_send