Open joeyoravec opened 1 month ago
@joeyoravec is this ready to review or are you still working on this?
I've left this in draft so it's clear that it is not ready for merge. Yes, this "works" and is worth reviewing.
fixes COVESA/vsomeip#669
As described in issue #669, the train logic should aggregate Service Discovery messages, but currently does not. This leads to very low performance on systems offering thousands of EventGroups.
These changes improve performance on the 3.4.x branch. I'm leaving this PR in draft status to gather feedback.
Allow passengers with same identifier aboard same train
The file implementation/endpoints/src/server_endpoint_impl.cpp has a test that will “hit” for every single Service Discovery message: Subscribe, SubscribeAck, etc. Each has the same service/identifier, which forces the train to depart immediately and prevents the train logic from aggregating multiple SD passengers into the same train.
I've eliminated this entire check on my system. The only rationale I can think of for keeping this logic is debouncing: having exactly 1 passenger in each train with the "final, debounced EventGroup value". I don't think the code implements this. If this was a design goal, then I think it would need to be handled higher in the stack on an EventGroup-by-EventGroup basis, not as a calculation of train departure time.
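To illustrate the effect of the check described above, here is a minimal sketch, not the vsomeip implementation. `Passenger`, `count_trains`, the byte limit, and the flag name are all hypothetical; the model only counts how many trains (network sends) a queue of passengers needs with and without a "depart when the identifier is already aboard" rule.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for a queued message; not a vsomeip type.
struct Passenger {
    std::uint32_t identifier;  // e.g. service and method packed into one value
    std::size_t size_bytes;
};

// Counts the trains (network sends) needed for a queue of passengers.
// When depart_on_duplicate_id is true, a passenger whose identifier is
// already aboard forces the current train to depart before it boards.
std::size_t count_trains(const std::vector<Passenger>& queue,
                         std::size_t max_train_bytes,
                         bool depart_on_duplicate_id) {
    assert(max_train_bytes > 0);
    std::size_t trains = 0;
    std::size_t used = 0;
    std::set<std::uint32_t> aboard;
    bool occupied = false;

    for (const Passenger& p : queue) {
        const bool duplicate = aboard.count(p.identifier) > 0;
        const bool overflow = used + p.size_bytes > max_train_bytes;
        if (occupied && (overflow || (depart_on_duplicate_id && duplicate))) {
            ++trains;  // current train departs
            used = 0;
            aboard.clear();
            occupied = false;
        }
        aboard.insert(p.identifier);
        used += p.size_bytes;
        occupied = true;
    }
    if (occupied) {
        ++trains;  // last train departs
    }
    return trains;
}
```

With ten 100-byte passengers that all carry the same identifier and an illustrative 1400-byte limit, this model needs ten trains when the duplicate rule is enabled and one train when it is disabled.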
Eliminate debounce logic
Debouncing would make sense if it were applied on a message-by-message basis, so each message gets transmitted “no more often than X ms” and all updates except the final one are dropped. This way vsomeip events could be low-pass filtered like CAN periodic messages: some transmitting every 20 ms, others every 500 ms.
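A per-message debounce of this kind could look roughly like the sketch below. It is an illustration of the idea only, assuming each message has an integer id and a string payload; `PerMessageDebouncer` and its methods are hypothetical names and do not exist in vsomeip. Time is passed in explicitly so the behavior is easy to follow.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

using ms = std::chrono::milliseconds;

// Hypothetical per-message debouncer: each id has its own minimum interval,
// and only the most recent pending value is kept.
class PerMessageDebouncer {
public:
    void set_interval(std::uint32_t id, ms min_interval) {
        assert(min_interval.count() > 0);
        slots_[id].min_interval = min_interval;
    }

    // Store the latest value, overwriting any value still waiting to be sent.
    void offer(std::uint32_t id, std::string payload) {
        slots_[id].pending = std::move(payload);
    }

    // Return every pending message whose minimum interval has elapsed at `now`.
    std::vector<std::pair<std::uint32_t, std::string>> poll(ms now) {
        std::vector<std::pair<std::uint32_t, std::string>> out;
        for (auto& [id, slot] : slots_) {
            const bool due = !slot.last_sent ||
                             now - *slot.last_sent >= slot.min_interval;
            if (slot.pending && due) {
                out.emplace_back(id, std::move(*slot.pending));
                slot.pending.reset();
                slot.last_sent = now;
            }
        }
        return out;
    }

private:
    struct Slot {
        ms min_interval{0};
        std::optional<ms> last_sent;
        std::optional<std::string> pending;
    };
    std::map<std::uint32_t, Slot> slots_;
};
```

In this sketch an id configured with a 20 ms interval that receives several updates within that window sends only the last one, while an id configured with 500 ms is filtered independently. The interval belongs to the message, not to the train that carries it.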
The calculation in implementation/endpoints/src/server_endpoint_impl.cpp looks wrong because it applies this parameter to decide when an entire multi-passenger train should depart.
Apply retention time for Service Discovery
There is conditional logic in both 3.1.20 and 3.4.10 in implementation/endpoints/src/server_endpoint_impl.cpp to “always use zero” and skip the configured retention time for service discovery.
I've eliminated this override so we can use the configured 5 ms for service discovery. The only rationale to keep the original behavior is to minimize latency and respond immediately without waiting. In my case it makes more sense to wait and aggregate because there is really a lot of traffic.
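The trade-off can be sketched with a small model, under the assumption that a train departs a fixed retention time after its first passenger boards and that size limits are ignored. `effective_retention`, `count_sends`, and the `force_zero_for_sd` flag are hypothetical names for illustration, not vsomeip functions or settings.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

using us = std::chrono::microseconds;

// Hypothetical helper: the retention time applied to a message.
// force_zero_for_sd models an "always use zero" override for Service Discovery.
us effective_retention(bool is_service_discovery, us configured,
                       bool force_zero_for_sd) {
    return (is_service_discovery && force_zero_for_sd) ? us{0} : configured;
}

// Counts network sends for messages arriving at the given times (ascending).
// A train departs `retention` after its first passenger boards; a message
// arriving at or after that departure time starts a new train.
std::size_t count_sends(const std::vector<us>& arrivals, us retention) {
    assert(retention.count() >= 0);
    std::size_t sends = 0;
    bool waiting = false;
    us departure{0};
    for (const us& t : arrivals) {
        if (!waiting || t >= departure) {
            ++sends;
            departure = t + retention;
            waiting = true;
        }
    }
    return sends;
}
```

For example, with 20 messages arriving 300 microseconds apart (made-up arrival times), this model needs 20 sends with zero retention and 2 sends with a 5 ms retention, at the cost of up to 5 ms of added latency for the first message in each train.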
Current performance
With all of these changes together, my Service Discovery is taking about 300 ms end-to-end, down from well over 2 s (usually even exceeding a single SD interval and needing to retry). In this example, there is a nominal 200-400 microsecond gap between packets; some are back-to-back and some are up to 2 ms apart. System load is significantly better because packets are aggregated into fewer network send calls.