ArduPilot / ardupilot

ArduPlane, ArduCopter, ArduRover, ArduSub source
http://ardupilot.org/
GNU General Public License v3.0

Plane: FrSky telemetry works on copter, broken on plane #6274

Closed magicrub closed 4 years ago

magicrub commented 7 years ago

We're getting reports of FrSky telemetry working on Copter but broken on Plane. It was tried on Plane v3.7.x and the v3.8 betas. I don't have any other details than that, and I don't have FrSky telemetry hardware to test with. @floaledm please test and report back.

floaledm commented 7 years ago

Thanks Tom. I will test ASAP and report back.

magicrub commented 7 years ago

Thanks! Also, ping @badzz

robustini commented 7 years ago

Hi Tom, I use the "FlightDeck" telemetry system on all my drones with the latest commit in master, and in both I have the same problem: the telemetry works fine only while the system is disarmed, when the data refresh on the Taranis display is good. After takeoff, the battery voltage and other values freeze at their last state and the data update is horrible. Only the artificial horizon and little else continue to work, and with a very slow refresh.

http://www.craftandtheoryllc.com/flightdeck-taranis-opentx-ardupilot-arducopter-pixhawk-2-cube-servo-frsky-x9d-x7-q-x7-qx7-telemetry-smartport-smart-port-serial

magicrub commented 7 years ago

Ouch. Do you know if this was a recent bug or has it always (not) worked this way? Is it still there in plane 3.7?



floaledm commented 7 years ago

Was working on a major release so apologies for not getting back to this sooner...

Just tested 3.8.0beta5 and the "latest" plane (Mon May 22 12:19:38) versions and when armed or disarmed, both the SPort (4) and Passthrough (10) protocols work fine on my end...

We've had reports from users that the FrSky telemetry link is "slowed/inoperable" (which actually means that the autopilot does not respond to polling requests from the RX within the allotted time) when:

This may not be that relevant here, but there have been discussions on the ArduPilot forum in which EKF3 was reported to cause issues on Copter. That is reported as fixed as of 3.5.0-rc5.

floaledm commented 6 years ago

I'm confirming that with a slow or corrupted microSD card, FrSky telemetry on the Pixhawk goes down (typically after arming, because this is when default logging starts), but that reformatting the card or replacing it with a decent generic microSD invariably fixes the problem. See https://discuss.ardupilot.org/t/solved-bug-with-frsky-native-telemetry-arming-disable-it/21789/3.

Other than that, I haven't had reports of it not working on either plane or copter, and I've just tested on the latest releases of both...

So, in my view, this issue could be closed.. or at least rebranded as an issue about CPU load being too high when attempting to log onto a faulty microSD...

magicrub commented 6 years ago

@robustini can you re-confirm this is still a problem on v3.8.x? Please try to help out @floaledm so he can reproduce your problem and we can get to the bottom of this.

robustini commented 6 years ago

@magicrub I can confirm, Tom: the problem is related to LOG_BITMASK. With fairly simple logging the FrSky telemetry works fine, but when logging a lot of data it stops working. So, as @floaledm writes, the problem is tied to a heavy log-writing load.

floaledm commented 6 years ago

Thanks for reporting back. In addition to changing LOG_BITMASK, we have multiple users who have resolved this by changing to a "better/faster" (but not necessarily high-perf or branded) SD card.

Is anything else known to be affected when there's a high CPU load due to logging being too slow? Maybe the move to ChibiOS will somehow help with that?

On a somewhat related side note, hopefully serial interrupts can also be introduced at some point to avoid having to periodically poll the serial port for incoming data...

magicrub commented 6 years ago

So what is the root cause? microSD buffers overflowing? How does that impact the FrSky stuff? It should be a totally different thread and buffer.

"slow or corrupted microSD card that FrSky telemetry on the Pixhawk goes down"

But the other serial ports don't go down? What's different?

floaledm commented 6 years ago

What's different with that serial protocol is that the FrSky receiver repeatedly sends poll requests and ArduPilot essentially has 4ms to respond. The way I understand it is that a CPU overload causes ArduPilot to be too slow to respond within the 4ms. When ArduPilot takes too long to respond, no telemetry ends up getting sent back from the RX to the TX/RC controller.

The CPU overload is caused by SD card logging taking too long...
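To illustrate the timing argument above, here is a minimal Python sketch. It is not ArduPilot code. It assumes a poll counts as answered only if the reply is ready within the 4 ms window mentioned in this thread; the delay values and function names are hypothetical and chosen only for illustration.

```python
POLL_RESPONSE_WINDOW_MS = 4.0  # response window quoted in this thread


def answered_polls(service_delays_ms, window_ms=POLL_RESPONSE_WINDOW_MS):
    """Return one bool per poll: True if the reply was ready inside the window."""
    return [delay <= window_ms for delay in service_delays_ms]


def telemetry_success_rate(service_delays_ms, window_ms=POLL_RESPONSE_WINDOW_MS):
    """Fraction of polls answered in time (0.0 if there were no polls)."""
    results = answered_polls(service_delays_ms, window_ms)
    return sum(results) / len(results) if results else 0.0


# Hypothetical delays (ms) between a poll arriving and the autopilot replying.
idle_delays = [1.0, 0.8, 1.2, 0.9]       # light load: every reply is in time
logging_delays = [1.0, 6.5, 12.0, 0.9]   # slow SD writes delay some replies

print(telemetry_success_rate(idle_delays))
print(telemetry_success_rate(logging_delays))
```

In this model a late reply is simply a lost reply, which would match the symptom of values freezing on the transmitter while the link itself stays up.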

magicrub commented 6 years ago

Well, having a buffer size of zero can't be good.. https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_SerialManager/AP_SerialManager.h#L45

magicrub commented 6 years ago

Whoops, never mind. Looks like every single one of those defaults is overwritten inside the driver here with useful numbers: https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_HAL_PX4/UARTDriver.cpp#L49

floaledm commented 6 years ago

Thanks for looking into this. Oh, and the other difference is that the FrSky telemetry task ("tick") is not managed by the vehicle scheduler like the other serial ports are, but is instead registered this way (which I understand runs at 1kHz): https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_Frsky_Telem/AP_Frsky_Telem.cpp#L65

That was the way the maintainer did it at the time on advice from Tridge. The reason it was set up that way was so that it could run at a high frequency, again because there's only 4ms to respond to a poll request.

magicrub commented 6 years ago

yeah, I noticed that. Well, latency will get better with ChibiOS but it won't do miracles. Perhaps this driver needs to be redesigned?

floaledm commented 6 years ago

It could but the problem again is with the logging to the SD card. No one may have reported a problem with that but it should be cause for concern to have a task (the logging) that potentially causes very high CPU load...

auturgy commented 6 years ago

@peterbarker

peterbarker commented 6 years ago

On Mon, 9 Oct 2017, Florent Martel wrote:

"The CPU overload is caused by SD card logging taking too long..."

Are you on the IO thread?

floaledm commented 6 years ago

@peterbarker Not sure... it uses hal.scheduler->register_io_process() https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_Frsky_Telem/AP_Frsky_Telem.cpp#L65

peterbarker commented 6 years ago

On Sun, 26 Nov 2017, Florent Martel wrote:

"@peterbarker Not sure... it uses hal.scheduler->register_io_process() https://github.com/ArduPilot/ardupilot/blob/master/libraries/AP_Frsky_Telem/AP_Frsky_Telem.cpp#L65"

Right. That thread exists to handle IO interactions that may have extreme latency - think writing to an SD card and that SD card going out to lunch for half-a-second. It's about the worst place in the code you could possibly hook in to get low latency!

Sounds like another thread would be a good idea here.
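The effect described above can be sketched with a small deterministic model. It assumes, purely for illustration, a single IO thread that alternates between one SD card write and one FrSky tick. The function names, the 100 µs tick cost and the write durations are hypothetical; only the 4 ms window and the half-second stall come from this thread.

```python
WINDOW_US = 4000  # 4 ms poll response window, in microseconds


def tick_start_times_us(write_durations_us, tick_cost_us=100):
    """Start time of each FrSky tick when it shares one thread with SD writes."""
    now = 0
    starts = []
    for write_us in write_durations_us:
        now += write_us      # the SD write runs to completion first
        starts.append(now)   # only then does the FrSky tick get to run
        now += tick_cost_us
    return starts


def worst_gap_us(starts):
    """Longest time between two consecutive tick starts."""
    return max(later - earlier for earlier, later in zip(starts, starts[1:]))


healthy_card = [500, 500, 500, 500]        # every write takes 0.5 ms
stalling_card = [500, 500, 500_000, 500]   # one write stalls for half a second

print(worst_gap_us(tick_start_times_us(healthy_card)))   # 600
print(worst_gap_us(tick_start_times_us(stalling_card)))  # 500100
```

With the healthy card the worst gap between ticks stays well under the window. With a single half-second stall, the tick is held off for far longer than 4 ms, so every poll arriving in that interval would go unanswered.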

floaledm commented 6 years ago

@peterbarker That explains a lot. This is how it's been since the FrSky lib was created. I could use your guidance on what would be a suitable alternative (there's a 4ms window to respond to polling requests from the receiver)...

palm369 commented 6 years ago

Did some ground tests on AP3.8.3 with Florent's Lua script (serial_protocol = 10): [image]

using these cards: [image: 2017-11-28 18 23 53]

LOG_FILE_BUFSIZE = 16. Had little gaps even while using the fast SD card: [image: screenshot 2017-11-28 18 10 48]

magicrub commented 6 years ago

Does this imply there's a CPU/IOthread resource limitation? Doesn't Copter have a higher cpu utilization? if so, why would this happen on Plane but not copter?

palm369 commented 6 years ago

Just tested AC3.5.4 (stock settings) with the slow card and serial_protocol = 10. It was working, but updated very slowly while logging. With the faster card it was better and only occasionally had very short dropouts (about 0.5s).

palm369 commented 6 years ago

These are the speeds of the cards I tested with. Slow one: [image: AgfaPhoto 4 GB]. Fast one: [image: SanDisk 32 GB].

magicrub commented 6 years ago

really?!?!?! Small writes are <= 5kB/s!?!?!? Don't think so...

palm369 commented 6 years ago

I was also wondering what's going on. Did the test twice to be sure... similar result!

kantlivelong commented 6 years ago

I'm seeing this behavior on Copter 3.5.5.

magicrub commented 5 years ago

can we please check this again using a ChibiOS build

floaledm commented 5 years ago

Thanks Tom for your followup. I don't have means to replicate the problem (e.g., using an SD card that had the issue before) so can't give you more of an answer than it's working for now on my end with ChibiOS.

tridge commented 5 years ago

what we need to do is move the sdcard logging to its own thread. It should not share with other devices like frsky
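As a rough illustration of that idea, here is a Python sketch using the standard threading and queue modules. It is not how ArduPilot implements logging; the names and the 0.2 s simulated write are assumptions. The slow write is handed to a dedicated thread through a queue, so the loop standing in for FrSky poll handling never waits on the card.

```python
import queue
import threading
import time


def run_demo(num_ticks=100, write_seconds=0.2):
    log_queue = queue.Queue()          # unbounded, so put() does not block
    write_finished = threading.Event()
    written = []

    def logger_thread():
        # Stand-in for a dedicated SD card logging thread.
        while True:
            record = log_queue.get()
            if record is None:         # sentinel: shut down
                break
            time.sleep(write_seconds)  # simulated slow card write
            written.append(record)
            write_finished.set()

    worker = threading.Thread(target=logger_thread)
    worker.start()

    log_queue.put("log block")         # hand the write off and carry on
    ticks_done = 0
    for _ in range(num_ticks):
        ticks_done += 1                # stand-in for servicing one FrSky poll
    writer_still_busy = not write_finished.is_set()

    log_queue.put(None)
    worker.join()
    return ticks_done, writer_still_busy, written


ticks_done, writer_still_busy, written = run_demo()
print(ticks_done, writer_still_busy, written)
```

All the ticks complete while the simulated write is still in progress, and the record is still written once the logger thread catches up.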

IamPete1 commented 5 years ago

I can confirm this is still an issue on Rover master

gumisb commented 5 years ago

I can confirm that on ChibiOS Plane latest stable (hw: Matek Wing F405), with logging enabled, FrSky passthrough telemetry (10) stops working after arming. Logging starts after arming. With logging disabled (log mask = 0) everything works as it should.

IamPete1 commented 5 years ago

Just had a fairly brief look at this and can't see any obvious differences between how it is set up on Copter vs anything else, yet seemingly it works fine on Copter and lags on everything else with heavy logging? Maybe this is not the case; I have not done any back-to-back tests on identical hardware. Or it is quite possible I missed some difference.

The consensus of this issue seems to be that logging should be moved into its own thread to free up the IO thread?

yaapu commented 5 years ago

@Tridge created PR https://github.com/ArduPilot/ardupilot/pull/12212 which should finally close this

tridge commented 5 years ago

12212 is now merged, so testing of master to confirm this issue is fixed would be appreciated

rmackay9 commented 4 years ago

My guess is this has been fixed now and this can be closed..