Describe the bug
When a slow or somewhat "full" telemetry link is used, a mission upload (a download from PX4's perspective) can cause excessive growth in the number of mission requests, which makes the problem even worse!
What happens:
When an upload is triggered (e.g. from QGC), the ground station sends a mission count to PX4 to indicate how many mission elements there are.
PX4 starts by requesting element 1 (and starts a timeout timer)
QGC sends element 1
PX4 times out waiting for element 1 -> re-request element 1
QGC sends element 1 again (let's call it element 1')
Element 1 is received by PX4, now request element 2
QGC sends element 2
Element 1' is received by PX4, "element 1 is not expected", therefore send a request for element 2
QGC sends element 2'
Initial element 2 request timed out, re-request element 2
QGC sends element 2''
Element 2 is received by PX4, now request element 3
QGC sends element 3
Element 2' is received by PX4, "element 2 is not expected", therefore send a request for element 3
QGC sends element 3'
etc.
Due to the significant increase in requests (and therefore in QGC's answers), the telemetry link gets flooded with messages, which only makes the problem worse!
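The sequence above can be reproduced in a toy event simulation. This is only a sketch under assumptions that are not taken from the PX4 source: the function name `simulate_upload` and all its parameters are hypothetical, the link has a fixed delay and unlimited bandwidth, the sender answers every request immediately, and the retry timer keeps running until the expected element arrives. A real link that also saturates would likely behave worse.

```python
import heapq
from itertools import count


def simulate_upload(n_items, one_way_delay_ms, retry_timeout_ms, rerequest_on_unexpected):
    """Count the item requests a receiver sends while fetching n_items (n_items >= 1)."""
    events = []            # min-heap of (time_ms, tie_breaker, kind, seq)
    tie_breaker = count()  # keeps events with equal timestamps in insertion order
    round_trip_ms = 2 * one_way_delay_ms
    requests_sent = 0
    expected = 0           # sequence number the receiver is waiting for

    def send_request(now):
        nonlocal requests_sent
        requests_sent += 1
        # Assumption: every request is answered, so one item arrives a round trip later.
        heapq.heappush(events, (now + round_trip_ms, next(tie_breaker), "item", expected))

    def arm_retry_timer(now):
        heapq.heappush(events, (now + retry_timeout_ms, next(tie_breaker), "timeout", expected))

    send_request(0)
    arm_retry_timer(0)
    while expected < n_items:
        now, _, kind, seq = heapq.heappop(events)
        if kind == "timeout":
            if seq == expected:        # still waiting: retry and re-arm the timer
                send_request(now)
                arm_retry_timer(now)
        elif seq == expected:          # the element we asked for: move on
            expected += 1
            if expected < n_items:
                send_request(now)
                arm_retry_timer(now)
        elif rerequest_on_unexpected:  # stale duplicate: request `expected` again
            send_request(now)
    return requests_sent
```

With assumed values of 100 elements, a 300 ms one-way delay and a 500 ms retry timeout, `simulate_upload(100, 300, 500, True)` returns 5150 requests, while `simulate_upload(100, 300, 500, False)` returns 200. In this model every stale duplicate produces a new request, whose answer becomes a stale duplicate for the next element, so the number of requests per element keeps growing.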
To Reproduce
This is difficult to reproduce if you don't have a telemetry modem that causes this level of delay.
I guess (I haven't tried it) it should also be possible to reproduce this by pushing the used bandwidth above the available bandwidth (e.g. by increasing the send rate of some of the larger MAVLink messages).
This should also cause the timeout to trigger sooner.
Expected behavior
I would expect PX4 not to send a request at all when an unexpected element is received, or at least not when that element has already been received!
The timeout and re-request behavior (as designed in MAVLink) can cause multiple instances of the same element to be received. Re-requesting whenever a received element does not match the expected one is therefore IMHO wrong, since the timeout will always trigger a re-request when the element is not received in time...
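A minimal sketch of the behavior I would expect, written in Python for brevity rather than as PX4 code. The class `MissionReceiver`, its methods and the `send_request` callback are all hypothetical names introduced for illustration. Unexpected elements are dropped silently and only the retry timer is allowed to repeat a request.

```python
class MissionReceiver:
    """Hypothetical receiver that re-requests only from its retry timer."""

    def __init__(self, item_count, send_request):
        self.item_count = item_count
        self.send_request = send_request  # callback taking a sequence number
        self.expected = 0                 # next sequence number we want
        self.items = []

    @property
    def done(self):
        return self.expected >= self.item_count

    def start(self):
        self.send_request(self.expected)

    def on_item(self, seq, item):
        if seq != self.expected:
            # Duplicate or out-of-order element: drop it without sending anything.
            # If the expected element was really lost, the retry timer recovers it.
            return
        self.items.append(item)
        self.expected += 1
        if not self.done:
            self.send_request(self.expected)

    def on_retry_timeout(self):
        # The only place where a request is repeated.
        if not self.done:
            self.send_request(self.expected)
```

For example, with `sent = []` and `rx = MissionReceiver(2, sent.append)`, calling `rx.start()`, `rx.on_item(0, "a")` and then `rx.on_item(0, "a")` again leaves `sent` at `[0, 1]`: the duplicate does not trigger a second request for element 1.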
Log Files and Screenshots
Always provide a link to the flight log file:
Unfortunately I haven't been able to create a decent log file that contains either the debug output or the MAVLink messages.
I do have a screenshot showing an upload of a flight plan containing 100 waypoints. At the moment it was taken, PX4 was requesting waypoint 93, but the upload had already been running for about 20 minutes and PX4 had sent the mission_request message over 22,000 times!
Toward the end, it was requesting each waypoint a few thousand times!
Drone:
Although unrelated:
We are using a VTOL drone running PX4 on a Cube Orange.
What might be of interest:
Radio:
We use RFDesign RFD900x radios with an EU-specific configuration (a 10% duty cycle at an over-the-air speed of 200mb/s). Hardware flow control is enabled (forced).