My automotive system has *.fidl with ~3500 attributes, one per CAN signal. My *.fdepl maps each attribute into a unique EventGroup.
Any time the network connection is established, or broken and re-established, I get an avalanche of ~3500 subscribes, followed by ~3500 acknowledgements, transmitted one-per-frame. The entire sequence does not fit inside a 2 seconds Service Discovery interval. When the work does not complete within the timeout interval then routingmanager will issue StopSubscribe and SubscribeNAK. The system will retry but it will take a long time, at least a couple of Service Discovery intervals.
The train logic is supposed to aggregate these together, sending a train only when it’s full or 5 ms elapse, but there are several places in the code that prevent this.
Reproduction Steps
This behavior is easily reproduced when the system has a *.fidl with 1000s of attributes and *.fdepl puts each into a unique EventGroup.
Subscribe to all ~3500 attributes, use an ifconfig down; sleep 10; ifconfig up to break and re-establish the network connection, look at the tcpdump and observe the network behavior.
Expected behaviour
The train logic should do a "pretty good job" to aggregate many SUBSCRIBE and many SUBSCRIBEACK into each Service Discovery packet.
Logs and Screenshots
With the existing code you should see 1000s of back-to-back SUBSCRIBE like:
each of ~98 bytes, separate packets, nothing or almost-nothing aggregated. In this region we see a SUBSCRIBENACK and socket close because the entire sequence exceeded the 2s Service Discovery timeout interval
vSomeip Version
v3.4.10
Boost Version
1.82
Environment
Android and QNX
Describe the bug
My automotive system has
*.fidl
with ~3500 attributes, one per CAN signal. My*.fdepl
maps each attribute into a unique EventGroup.Any time the network connection is established, or broken and re-established, I get an avalanche of ~3500 subscribes, followed by ~3500 acknowledgements, transmitted one-per-frame. The entire sequence does not fit inside a 2 seconds Service Discovery interval. When the work does not complete within the timeout interval then routingmanager will issue StopSubscribe and SubscribeNAK. The system will retry but it will take a long time, at least a couple of Service Discovery intervals.
The train logic is supposed to aggregate these together, sending a train only when it’s full or 5 ms elapse, but there are several places in the code that prevent this.
Reproduction Steps
This behavior is easily reproduced when the system has a
*.fidl
with 1000s of attributes and*.fdepl
puts each into a unique EventGroup.Subscribe to all ~3500 attributes, use an
ifconfig down; sleep 10; ifconfig up
to break and re-establish the network connection, look at the tcpdump and observe the network behavior.Expected behaviour
The train logic should do a "pretty good job" to aggregate many SUBSCRIBE and many SUBSCRIBEACK into each Service Discovery packet.
Logs and Screenshots
With the existing code you should see 1000s of back-to-back SUBSCRIBE like:
each of ~98 bytes, separate packets, nothing or almost-nothing aggregated. In this region we see a SUBSCRIBENACK and socket close because the entire sequence exceeded the 2s Service Discovery timeout interval