OpenEtherCATsociety / SOES

Simple Open Source EtherCAT Slave

If spi communication fails, infinite do-while is executed. #36

Closed KwonTae-young closed 6 years ago

KwonTae-young commented 6 years ago

Hello.

I ran into an unusual situation. I removed the EtherCAT module (LAN9252), which is connected over SPI, while SOES was running on Linux. In this case, the HAL code keeps spinning in a do-while loop: https://github.com/OpenEtherCATsociety/SOES/blob/56a4dbf247d2efea09602c36745eadb5628f4dca/soes/hal/linux-lan9252/esc_hw.c#L245-L248

On my board, if the ESC is not connected over SPI, every read returns 0x0. [image: spi_read_fail]

If SPI communication becomes unstable while SOES is running, or if the module is removed, that do-while loop never terminates, so SOES hangs at that point.

On the other hand, on a CPU that reads 0xffffffff instead of 0x0 when nothing answers on SPI, the infinite do-while loop occurs in the following part instead: https://github.com/OpenEtherCATsociety/SOES/blob/56a4dbf247d2efea09602c36745eadb5628f4dca/soes/hal/linux-lan9252/esc_hw.c#L234-L237
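To illustrate why both read values cause a hang, and one possible way to bound such a loop, here is a minimal sketch. It is not the SOES code: `poll_reg_bounded`, `reg_read_fn`, `POLL_MAX_RETRIES` and the two simulated read functions are hypothetical names introduced only for this example, and the retry budget is an arbitrary assumption.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical retry budget; tune for the real SPI clock and ESC timing. */
#define POLL_MAX_RETRIES 1000u

/* Stand-in for the HAL's 32-bit SPI register read (assumed signature). */
typedef uint32_t (*reg_read_fn)(uint16_t address);

/*
 * Poll `address` until (value & mask) == wanted, or give up after
 * POLL_MAX_RETRIES reads. Returns true on success, false on timeout.
 * The last value read is stored in *last when `last` is not NULL.
 */
static bool poll_reg_bounded(reg_read_fn read32, uint16_t address,
                             uint32_t mask, uint32_t wanted, uint32_t *last)
{
   assert(read32 != NULL);

   for (uint32_t i = 0; i < POLL_MAX_RETRIES; i++)
   {
      uint32_t value = read32(address);
      if (last != NULL)
      {
         *last = value;
      }
      if ((value & mask) == wanted)
      {
         return true;
      }
   }
   return false; /* the caller decides how to report or recover */
}

/* Simulated buses with no ESC attached: reads are all zeros or all ones. */
static uint32_t read_all_zeros(uint16_t address)
{
   (void)address;
   return 0x00000000u;
}

static uint32_t read_all_ones(uint16_t address)
{
   (void)address;
   return 0xFFFFFFFFu;
}
```

With `read_all_zeros`, a loop waiting for a "ready" bit to become 1 never succeeds; with `read_all_ones`, a loop waiting for a "busy" bit to become 0 never succeeds. An unbounded do-while hangs in exactly one of the two cases depending on what the board reads, while the bounded version returns false in both.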

I modified the source to work around the problem, as shown in this commit: https://github.com/xoduddk123/SOES/commit/d9ea6443b0f7d9212dc994a86289ea0cc974cef6 With the modified source, when the module is removed the infinite do-while is no longer executed and SOES does not stop.

What do you think about this issue? I think some solution is needed, even if my approach is not the right one.

Sorry, my English is not very good. :)

Thanks!

nakarlsson commented 6 years ago

I think this should be handled by the watchdogs.

nakarlsson commented 6 years ago

Of course, you're free to check BYTE_TEST and exit the loop to handle the issue elsewhere. I think this issue has multiple solutions depending on the application.

KwonTae-young commented 6 years ago

Hello.

If the LAN9252 module is removed, the application is already inside esc_hw.c, spinning in the do-while loop, so it never reaches the point where I could check the watchdog. How can I handle this on the application side? https://github.com/OpenEtherCATsociety/SOES/blob/56a4dbf247d2efea09602c36745eadb5628f4dca/applications/linux_lan9252demo/slave.c#L153-L161

My idea was:
① Check BYTE_TEST in esc_hw.c and exit the loop if the module is not connected.
② Output an error about the module disconnect.

To handle this situation, I had to modify esc_hw.c so that step ① can escape the do-while loop. Is there a way for the application to handle it without modifying esc_hw.c? Can you give me a hint?
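As a rough illustration of steps ① and ②, the sketch below checks the LAN9252 BYTE_TEST register (offset 0x064, expected value 0x87654321) on every pass through a wait loop and leaves the loop with an error when the value does not match. It is not the code from the linked commit or from SOES: `esc_check_link`, `esc_wait_checked`, `spi_read32_fn`, `esc_link_t` and the simulated read functions are hypothetical names, and the error is simply printed to stderr.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define LAN9252_BYTE_TEST_REG   0x064u      /* BYTE_TEST register offset */
#define LAN9252_BYTE_TEST_VALUE 0x87654321u /* value read from a reachable chip */

/* Stand-in for the HAL's 32-bit SPI register read (assumed signature). */
typedef uint32_t (*spi_read32_fn)(uint16_t address);

/* Hypothetical status type for this sketch. */
typedef enum { ESC_LINK_OK = 0, ESC_LINK_LOST = 1 } esc_link_t;

/* Step 1 helper: the ESC is considered reachable only if BYTE_TEST matches. */
static esc_link_t esc_check_link(spi_read32_fn read32)
{
   assert(read32 != NULL);
   return (read32(LAN9252_BYTE_TEST_REG) == LAN9252_BYTE_TEST_VALUE)
             ? ESC_LINK_OK
             : ESC_LINK_LOST;
}

/*
 * Wait until (value & mask) == wanted at `address`, verifying BYTE_TEST
 * before each poll. Step 1: leave the loop when BYTE_TEST mismatches.
 * Step 2: report the disconnect. Note that this still waits forever if
 * the link is healthy but the condition never becomes true.
 */
static esc_link_t esc_wait_checked(spi_read32_fn read32, uint16_t address,
                                   uint32_t mask, uint32_t wanted)
{
   for (;;)
   {
      if (esc_check_link(read32) != ESC_LINK_OK)
      {
         fprintf(stderr, "ESC not responding: BYTE_TEST mismatch\n");
         return ESC_LINK_LOST;
      }
      if ((read32(address) & mask) == wanted)
      {
         return ESC_LINK_OK;
      }
   }
}

/* Simulated buses for trying the sketch without hardware. */
static uint32_t bus_healthy(uint16_t address)
{
   return (address == LAN9252_BYTE_TEST_REG) ? LAN9252_BYTE_TEST_VALUE
                                             : 0x00000001u;
}

static uint32_t bus_zeros(uint16_t address)
{
   (void)address;
   return 0x00000000u;
}

static uint32_t bus_ones(uint16_t address)
{
   (void)address;
   return 0xFFFFFFFFu;
}
```

Checking BYTE_TEST before every poll costs one extra SPI read per iteration. In exchange, it catches both the 0x0 and the 0xffffffff case, because neither value equals 0x87654321.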

nakarlsson commented 6 years ago

What do you do after 2? In a real-world application I'd say you'd want to reset the slave if you lost communication with the ESC, and such a reset might be triggered by a HW watchdog. If SOES hangs, it should trigger that HW watchdog. But this is application dependent, I'd say.
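One way to read the watchdog suggestion is that the main loop kicks the hardware watchdog only after a full, successful pass, so a hang or a lost ESC lets the watchdog expire and reset the system. The sketch below assumes that design. `loop_hooks_t`, `supervised_iteration` and the `sim_*` functions are hypothetical names for illustration, and how the watchdog is actually kicked is platform specific (on Linux, for instance, writing to /dev/watchdog is one option).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical hooks supplied by the application. */
typedef struct
{
   bool (*esc_alive)(void);     /* e.g. a BYTE_TEST comparison */
   void (*run_stack)(void);     /* one pass of the slave stack */
   void (*kick_watchdog)(void); /* platform-specific watchdog refresh */
} loop_hooks_t;

/*
 * Run one supervised pass. The watchdog is kicked only when the ESC
 * answered and the stack pass returned; otherwise the kick is skipped
 * so that the hardware watchdog can time out and reset the system.
 */
static bool supervised_iteration(const loop_hooks_t *hooks)
{
   assert(hooks != NULL);

   if (!hooks->esc_alive())
   {
      return false; /* no kick: let the HW watchdog expire */
   }
   hooks->run_stack();
   hooks->kick_watchdog();
   return true;
}

/* Simulated hooks so the sketch can be exercised without hardware. */
static bool sim_esc_present = true;
static unsigned sim_runs = 0;
static unsigned sim_kicks = 0;

static bool sim_esc_alive(void) { return sim_esc_present; }
static void sim_run_stack(void) { sim_runs++; }
static void sim_kick(void) { sim_kicks++; }
```

This only helps if the stack pass itself cannot block forever, or if a blocked pass is exactly what should cause the reset. Whether a reset is the right reaction remains application dependent.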

KwonTae-young commented 6 years ago

② After that, the usual soes() loop continues. https://github.com/OpenEtherCATsociety/SOES/blob/56a4dbf247d2efea09602c36745eadb5628f4dca/applications/linux_lan9252demo/slave.c#L207-L210 Later, when the ESC was physically reconnected, operation returned to normal.

I was confused because I thought the HW watchdog would not work if the ESC was not physically connected (for example, when the connector is unplugged). I have decided not to consider the case where the ESC is physically disconnected. It is really rare for a connector to come loose or be physically destroyed.

Thank you for your feedback.

Regards Kwon

nakarlsson commented 6 years ago

I think it is up to you how to implement it. I find a scenario where the CPU<->ESC link has been disconnected and reconnected quite tricky to handle. For your scenario to work, you must be sure the ESC didn't lose power and fall back to its reset values. Is it some hotswap module project?

KwonTae-young commented 6 years ago

I do not know what a hotswap module is. However, I am connecting the ESC using the following connector: https://www.digikey.tw/product-detail/en/hirose-electric-co-ltd/FX20-40P-0.5SV20/H122389-ND/5017644 https://www.digikey.tw/products/en?keywords=FX20-40S-0.5SV10

The ESC and CPU are connected through the connector. (VCC, GND, SPI_CLK, SPI_MISO, SPI_MOSI, SPI_CS)

nakarlsson commented 6 years ago

Do you cut the power to the ESC when disconnected?

KwonTae-young commented 6 years ago

Yes. The ESC is built as a module, so all six connections through the connector are disconnected.

nakarlsson commented 6 years ago

OK, I guess the master re-programs the slave again. Is it OK to close this issue? It is fine if you add the byte test code to the Linux port.

KwonTae-young commented 6 years ago

Yes. This helped a lot. Thank you.