ArduPilot / ardupilot

ArduPlane, ArduCopter, ArduRover, ArduSub source
http://ardupilot.org/
GNU General Public License v3.0

Copter: 4.0.5, 4.0.6 KakuteF7 FC disconnect in dronekit and mavproxy caused by DMA_PRIORITY #16415

Open RossHacquebard opened 3 years ago

RossHacquebard commented 3 years ago

Hello,

I'm reporting an issue I experienced; hopefully it will help someone out, or a fix can be included.

Bug report

Issue details

I was experiencing a problem with ArduCopter 4.0.5 firmware (and the 4.0.6 beta) while using a KakuteF7 flight controller connected to a companion computer on SERIAL_1. From this computer I could initially connect to the FC using dronekit (Python) or mavproxy from the terminal. However, after a minute or so (around the same time the GPS acquires a 3D fix), the FC disconnects. The "wait_ready experienced a time out of 30 seconds" message appears, and mavproxy says "no link". We are using two u-blox GPS modules connected to SERIAL_3 and SERIAL_4. Running SITL shows no problems, and running this firmware on a different board (KakuteF4) showed no problems either.
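To pin down when the link drops relative to the GPS fix, one option is to record heartbeat arrival times on the companion computer and look for gaps. The following is a minimal sketch in plain Python. The function name `find_link_gaps`, the 5-second threshold, and the sample timestamps are assumptions made for illustration; none of them come from dronekit or mavproxy.

```python
from typing import List, Tuple


def find_link_gaps(heartbeat_times: List[float],
                   timeout_s: float = 5.0) -> List[Tuple[float, float]]:
    """Return (last_seen, next_seen) pairs where consecutive heartbeats
    arrived more than timeout_s seconds apart."""
    gaps = []
    for prev, cur in zip(heartbeat_times, heartbeat_times[1:]):
        if cur - prev > timeout_s:
            gaps.append((prev, cur))
    return gaps


# Hypothetical arrival times in seconds: 1 Hz heartbeats for a minute,
# then silence until t=95.
times = [float(t) for t in range(0, 61)] + [95.0, 96.0]
print(find_link_gaps(times))  # [(60.0, 95.0)]
```

Comparing the start of each gap with the time the GPS reports a 3D fix would show whether the two events really coincide.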

After debugging, I found this issue is related to the DMA_PRIORITY parameter in the KakuteF7 hwdef. Reverting this specific commit fixed my issue: https://github.com/ArduPilot/ardupilot/commit/ce8b57e402fbf8fb5b10b125cc1929b981954177

Edit: Making this change caused the drone to sporadically drop out of the sky during flight. Not recommended.

The commit makes the following change: DMA_PRIORITY S* was changed to: DMA_PRIORITY TIM1 TIM3 SPI4 SPI1*
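A rough way to see what such a change does is to treat each DMA_PRIORITY entry as a glob pattern matched against peripheral names. The sketch below assumes glob-style matching and uses Python's `fnmatch`. The peripheral names, the exact pattern strings in the second call, and the helper `prioritised` are hypothetical and are not taken from the hwdef or the build scripts.

```python
from fnmatch import fnmatch
from typing import List


def prioritised(peripherals: List[str], patterns: List[str]) -> List[str]:
    """Return the peripherals matched by any pattern, ordered by pattern."""
    out: List[str] = []
    for pat in patterns:
        for periph in peripherals:
            if fnmatch(periph, pat) and periph not in out:
                out.append(periph)
    return out


# Hypothetical peripheral names competing for DMA.
peripherals = ["USART1_RX", "TIM1_UP", "TIM3_UP", "SPI1_RX", "SPI4_RX"]

# Old setting: everything starting with "S" is served first.
print(prioritised(peripherals, ["S*"]))
# ['SPI1_RX', 'SPI4_RX']

# Illustrative new setting: the motor timers are served before SPI.
print(prioritised(peripherals, ["TIM1*", "TIM3*", "SPI4*", "SPI1*"]))
# ['TIM1_UP', 'TIM3_UP', 'SPI4_RX', 'SPI1_RX']
```

In both cases USART1_RX is not in the priority list, so under this model it only gets whatever is left after the prioritised peripherals have been served.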

Version

4.0.5 ArduCopter firmware (and 4.0.6 beta)

Platform

[ ] All
[ ] AntennaTracker
[X] Copter
[ ] Plane
[ ] Rover
[ ] Submarine

rmackay9 commented 3 years ago

That's not good. FYI @chobitsfan @andyp1per.

If you're OK testing with "latest" I'd be interested in hearing if 4.1.0-DEV suffers from the same problem.

chobitsfan commented 3 years ago

Thank you @RossHacquebard @rmackay9

ce8b57e is an attempt to fix #9861, which caused copters to drop out of the sky due to motor glitches.

But this causes us to lose DMA for receive on Telem1 (USART1). I could not find a better arrangement due to the DMA controller and motor pinout.

I use UART2 for telemetry instead. Maybe I could add a note to the KakuteF7/AIO/Mini wiki docs suggesting that users use UART2 instead of UART1.

andyp1per commented 3 years ago

The KakuteF7Mini arrangement in 4.1-dev also supports bi-directional dshot with some additional DMA remapping - it's really hard to give everything the DMA it wants.

andyp1per commented 3 years ago

I think on that board I made sure SERIAL1 and SERIAL6 (RCIN) have DMA

RossHacquebard commented 3 years ago

I'm happy to test 4.1.0-DEV. Is there a specific branch I can fetch?

rmackay9 commented 3 years ago

@RossHacquebard, thanks! If you use Mission Planner (MP), just press Ctrl-Q on the Install Firmware screen and the label under the icons should change to "xx 4.1.0-DEV".

RossHacquebard commented 3 years ago

That's a fun Mission Planner easter egg! I tried Copter 4.1.0-DEV on a KakuteF7, and the results are the same: the FC disconnects.

chobitsfan commented 3 years ago

Hi @RossHacquebard, would you be able to test using UART2 instead of UART1? Thank you very much.

RossHacquebard commented 3 years ago

Hi @chobitsfan, sorry about the delay. I was able to test this issue with the companion computer connected to UART2 instead of UART1. It appears to be working correctly, and I don't get the disconnect issue anymore. Thanks for your suggestion!

We are now using all available UARTs on this board except UART1. Hopefully that doesn't cause memory access issues. Why does the DMA controller prioritise UART2 in this case?

Some more information about this: I mentioned that changing the hwdef back to DMA_PRIORITY S* fixed this issue. However, I wouldn't recommend it. Although it does prevent the disconnect, this change caused our drones to sporadically drop out of the sky while flying. The flight controller appeared to reboot, and the ESCs would make the startup tone while falling. I think it's the same as #9861.

I'll do a flight test soon to confirm everything else is running properly.

andyp1per commented 3 years ago

@RossHacquebard it's possible that my DMA changes in 4.1-dev will allow you to safely change the hwdef back to DMA_PRIORITY S* without drones falling out of the sky - I haven't tried this so Caveat Emptor.

RossHacquebard commented 3 years ago

Another update: today we tried the unmodified v4.0.7 release, with our companion computer connected to UART2 and two GPS modules on UART3/UART4. We are still experiencing the FC shutting down mid-flight and dropping from the sky.

We also tried v4.0.6 and had the same issue.

I saw in this post a potential solution: setting NODMA on all UARTs. We will try this next. Is this expected when using a KakuteF7? Are there any other things to try here?

Thanks

andyp1per commented 3 years ago

@RossHacquebard the dropping-out-of-the-sky problem will be because the TIMx_UP channel, required for dshot, is being shared with other devices, and it is their usage of the DMA channel that is preventing dshot packets from getting through in a timely fashion. The UP channels you need are TIM1_UP and TIM3_UP, so you could give dshot exclusive access to these like this:

DMA_NOSHARE TIM1_UP TIM3_UP

However, this will deprive other devices (like UARTs) of these DMA channels. The way to check what you have lost is to look at build/KakuteF7/hwdef.h and see whether anything is listed as unable to allocate a DMA channel. If so, then decide whether you care. Getting to a good combination is hard and I have spent a lot of time on this in 4.1 with the KakuteF7Mini hwdef.dat - you could check that for an alternative solution.
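The check of the generated header can be scripted. The sketch below is only an illustration: the helper `dma_warnings`, the search phrase, and the sample header text are assumptions, and the wording in a real generated hwdef.h may differ, so the phrase is a parameter.

```python
import re
from typing import List


def dma_warnings(hwdef_text: str,
                 phrase: str = "unable to allocate DMA") -> List[str]:
    """Return the lines of a generated header that mention the given phrase
    (case-insensitive)."""
    pattern = re.compile(re.escape(phrase), re.IGNORECASE)
    return [line.strip() for line in hwdef_text.splitlines()
            if pattern.search(line)]


# Made-up excerpt standing in for a generated header.
sample = """\
// auto-generated header
#define HAL_EXAMPLE 1
// USART1_RX: unable to allocate DMA channel
"""
print(dma_warnings(sample))
# ['// USART1_RX: unable to allocate DMA channel']
```

Against a real build, the text could be read with `open("build/KakuteF7/hwdef.h").read()` and passed to the same function, adjusting `phrase` to whatever the header actually says.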

RossHacquebard commented 3 years ago

Thanks for your response, @andyp1per. I understand better now that each user may have different DMA priorities depending on their build. In v4.0.7 I see the default priority is (for KakuteF7):

DMA_PRIORITY TIM1 TIM3 SPI4 SPI1

and the SPI devices listed as:

SPIDEV mpu6000  SPI4 DEVID1 ICM20689_CS MODE3  1*MHZ  4*MHZ
SPIDEV sdcard   SPI1 DEVID1 SDCARD_CS   MODE0 400*KHZ 25*MHZ
SPIDEV osd      SPI2 DEVID4 MAX7456_CS  MODE0 10*MHZ 10*MHZ

For us, motors (TIM1, TIM3), companion computer (PD5 which is USART2), IMU (SPI4), and other sensors for achieving stable flight are highest priority. But I see the sdcard (SPI1) is in the priority list. If something is marked NODMA, for example the sdcard device, will it fail completely or just save logs a bit slower?

Copter v3.6.x doesn't have this issue, but the only differences I see are SERIAL_ORDER and DMA_PRIORITY S*. Rolling back these changes on 4.0.x still shows the issue.

andyp1per commented 3 years ago

The only things you can safely disable DMA on are the UARTs and PWM outputs - but if you do, the former run slower and the latter cannot do dshot or LEDs.

SD card won't work if you disable DMA. Note that the priority field means "allocate my DMA first", it does not mean exclusive access. DMA_NOSHARE means exclusive access.
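The difference between "allocated first" and "exclusive access" can be shown with a toy allocator. This is a deliberately simplified model, not ArduPilot's actual DMA resolver: the function `allocate`, the stream names, and the candidate lists are all hypothetical.

```python
from typing import Dict, List, Optional


def allocate(candidates: Dict[str, List[str]],
             priority: List[str],
             noshare: List[str]) -> Dict[str, Optional[str]]:
    """Toy model: prioritised peripherals pick first; a peripheral shares a
    stream when none is free, unless a NOSHARE peripheral is involved."""
    order = [p for p in priority if p in candidates]
    order += [p for p in candidates if p not in priority]
    owners: Dict[str, List[str]] = {}  # stream -> peripherals using it
    result: Dict[str, Optional[str]] = {}
    for periph in order:
        streams = candidates[periph]
        free = [s for s in streams if s not in owners]
        if free:
            chosen: Optional[str] = free[0]
        elif periph in noshare:
            chosen = None  # a NOSHARE peripheral never joins a used stream
        else:
            # Share only streams whose current users all allow sharing.
            ok = [s for s in streams
                  if not any(o in noshare for o in owners[s])]
            chosen = ok[0] if ok else None
        result[periph] = chosen
        if chosen is not None:
            owners.setdefault(chosen, []).append(periph)
    return result


# Hypothetical candidates: the UART and the timer can only use stream_A.
candidates = {
    "USART1_RX": ["stream_A"],
    "TIM1_UP": ["stream_A"],
    "SPI1_RX": ["stream_B", "stream_C"],
}

# Priority only: TIM1_UP is served first, but USART1_RX still shares stream_A.
print(allocate(candidates, priority=["TIM1_UP"], noshare=[]))

# NOSHARE: TIM1_UP keeps stream_A to itself and USART1_RX gets no DMA.
print(allocate(candidates, priority=["TIM1_UP"], noshare=["TIM1_UP"]))
```

In the first call both TIM1_UP and USART1_RX end up on stream_A; in the second, USART1_RX maps to `None`, which corresponds to the UART losing DMA.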

chobitsfan commented 3 years ago

Hi @RossHacquebard, are you using a hexacopter? Thank you.

RossHacquebard commented 3 years ago

No, we are using a quadcopter. So perhaps disabling DMA for motors 5 and 6 will help here?

We will first test with NODMA on the UARTs.

andyp1per commented 3 years ago

@RossHacquebard I looked at the hwdef.dat. There are no conflicts with TIM1_UP/TIM3_UP and USART2 already has DMA assigned. The only thing I can see is that its TX is sharing a channel with TIM5_UP which is PWM6. Try commenting out PWM6 to see if that helps. If it doesn't then I wonder whether something else is going on.
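Spotting this kind of sharing by eye is error-prone, so a small script can group peripherals by their assigned stream. The sketch below uses a hand-written, hypothetical assignment; the stream names and the helper `shared_streams` are assumptions, and a real mapping would have to be extracted from the generated hwdef.h.

```python
from collections import defaultdict
from typing import Dict, List


def shared_streams(assignment: Dict[str, str]) -> Dict[str, List[str]]:
    """Return only the streams used by more than one peripheral."""
    by_stream: Dict[str, List[str]] = defaultdict(list)
    for periph, stream in assignment.items():
        by_stream[stream].append(periph)
    return {s: sorted(p) for s, p in by_stream.items() if len(p) > 1}


# Hypothetical assignment mirroring the situation described above.
assignment = {
    "USART2_TX": "stream_X",
    "TIM5_UP": "stream_X",
    "USART2_RX": "stream_Y",
    "TIM1_UP": "stream_Z",
}
print(shared_streams(assignment))
# {'stream_X': ['TIM5_UP', 'USART2_TX']}
```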

chobitsfan commented 3 years ago

Here is a summary of my test results:

Version    Serial port used      Result
4.0.7      SERIAL2 for telem     flight OK, telem OK
master     SERIAL2 for telem     flight OK, telem OK

RossHacquebard commented 3 years ago

The original issue, companion computer disconnecting when acquiring 3D fix, was fixed by switching to UART2.

Thank you for the information about DMA channels; however, I believe there is a separate issue going on here. We tested commenting out PWM6, as well as setting NODMA on all UARTs, and still got the shutdown issue. We were able to reproduce this issue on v3.6.9 firmware, and also when switching to a KakuteF4, which made me suspicious.

Motor shutdown ESC issue, March 30, 2021

Here's a log showing the shutdown issue. At the time of motor shutdown, the ESC fails to report anything e.g. temperature, voltage (blue, yellow on the chart). However, the flight controller has a consistent voltage the whole time (green). To give a bit more detail, we are using the Tekko32F3 Metal 4in1 ESC(65A) and a 6S battery. I noticed some similar models of this ESC had a recall due to desync issues. So perhaps this is not a firmware issue at all!

chobitsfan commented 3 years ago

Hi @RossHacquebard, thank you for helping with the testing.

> The original issue, companion computer disconnecting when acquiring 3D fix, was fixed by switching to UART2.

I think I could add some notes to the KakuteF7 ArduPilot wiki instructing users to use UART2 instead of UART1.