[Bug] SiK radios can not assert CTS to a proper level and may get Stuck in bootloader mode

davids5 commented 1 year ago

Describe the bug

We added Pulldowns in FW to the CTS pin on some of the board. This was done to ensure an unconnected port, that caused CTS to float HIGH (Not clear to send) would not cause mavlink (or serial_test) from blocking and require the process to be killed.

The pull down can cause the SiK to enter bootloader mode and not boot on power on.*
The pull down can prevent CTS/RTS handshaking from working.

I think we should back out the CTS pull down change.

*Harder to get to happen on stm32

Below are images with a old 3DR V2 radio connected to V6X

Yellow is TX from FMU Green is CTS into FMU

CTS high should stop the FMU TX

With Pulldown: No stopping of the FMU TX occurs while CTS is asserted (to wrong level).

No Pull down: Stopping of the FMU TX occurs while CTS is asserted (to correct level).

Thoughts?

@julianoes @niklaut @PetervdPerk-NXP

To Reproduce

See above

Expected behavior

Signal meets threshold.

Screenshot / Media

No response

Flight Log

None

Software Version

main

Flight controller

All

Vehicle type

None

How are the different components wired up (including port information)

No response

Additional context

No response

niklaut commented 1 year ago

Does it still happen if you use the internal pull UP on the CTS? Now that the TX semaphore problem is fixed, it should work fine.

PetervdPerk-NXP commented 1 year ago

From #22302 You could argue that the following is a user mistake right, enabling mavlink on a port that doesn't have a radio connected?

Deadlock Timeline The mavlink_if1 task is the first to send a message to TELEM1, which has nothing connected. This calls up_dma_txavailable which sets up the very first DMA TX transfer for UART7.

niklaut commented 1 year ago

You could argue that the following is a user mistake right,

No, because the interrupt-driven TX was non-blocking, and the TXDMA was also non-blocking on the STM32F7, just not on the STM32H7. The uart_write would return 0 bytes written in those cases and let the Mavlink layer decide how to handle that (it did polling). So this nxsem_wait violated that assumption. It would also be a problem if all mavlink threads were stuck each time the CTS is high, even if progress is eventually made.

davids5 commented 1 year ago

Does it still happen if you use the internal pull UP on the CTS? Now that the TX semaphore problem is fixed, it should work fine.

Sure it does CTS high is stop.

nsh> mavlink stop-all
INFO  [mavlink] waiting for instances to stop
.INFO  [mavlink] exiting channel 1
.......................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ERROR [mavlink] mavlink didn't stop, killing task 953
INFO  [mavlink] all instances stopped
nsh>

niklaut commented 1 year ago

I meant, does the SiK radio work better if the instead of a pull down we use a pull up? I think it's only important to have a defined state on these inputs, doesn't really matter if it's high or low.

davids5 commented 1 year ago

No it does matter. If it is pulled high the apps can be blocked. If it is pulled low, the app will not block. That was why it was done.

davids5 commented 1 year ago

In an ideal world the HW connected would be a push/pull output and could source/sink enough current. But that is not the case here. The

I think we should add 1 M ohm pull downs in HW on the FMU(M)s and leave the pins as float in FW. This will establish a level and not cause blocking. It will also not swamp the a HW design with a ~100K up.

PX4 / PX4-Autopilot