OpenEtherCATsociety / SOEM

Simple Open Source EtherCAT Master

When using SOEM to control ELMO Gold series Drive, the elmo will switch to switch-on-disabled state sometimes #812

Open crown133 opened 1 month ago

crown133 commented 1 month ago

Background: I'm trying to use SOEM as the EtherCAT master for a 10-DoF robot whose actuator drives are Elmo Gold Twitter units.

Issue: When SOEM controls the Elmo drive, the drive initializes to the operation-enabled state and works well, but it sometimes switches to the switch-on-disabled state.

I solved this problem by booting a real-time Linux kernel, setting the scheduler policy to 'SCHED_FIFO' with a thread priority of 50, and setting the CPU affinity. However, when I run a high-level controller in another thread at the same time, the problem appears again.
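For readers who want to reproduce this setup, here is a minimal sketch of how a thread might request 'SCHED_FIFO' and CPU affinity on Linux. The helper names `pin_to_cpu` and `set_fifo_priority` are hypothetical and are not part of SOEM; the CPU number in the usage note is an assumption, while the priority of 50 is the value mentioned above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <string.h>

/* Hypothetical helper: pin the calling thread to a single CPU.
 * Returns 0 on success or the errno value of the failed call. */
int pin_to_cpu(int cpu)
{
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* On Linux, pid 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set) == 0 ? 0 : errno;
}

/* Hypothetical helper: request SCHED_FIFO at the given priority for the
 * calling thread. Returns 0 on success, EINVAL for an out-of-range
 * priority, or the errno value of the failed call (typically EPERM when
 * the process lacks real-time privileges). */
int set_fifo_priority(int priority)
{
    if (priority < sched_get_priority_min(SCHED_FIFO) ||
        priority > sched_get_priority_max(SCHED_FIFO)) {
        return EINVAL;
    }
    struct sched_param sp;
    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = priority;
    return sched_setscheduler(0, SCHED_FIFO, &sp) == 0 ? 0 : errno;
}
```

Calling `pin_to_cpu(3)` followed by `set_fifo_priority(50)` at the top of the PDO thread would be one way to apply this. Checking the return values matters, because a silently failed call leaves the thread on the default scheduler.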

I want to know why the Elmo switches from the operation-enabled state to the switch-on-disabled state on its own even though I didn't change the 'controlword'. Also, why would the thread priority or similar factors cause this problem?
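To pin down exactly when the state change happens, it can help to decode the drive's statusword every cycle and log transitions. The sketch below assumes the drive follows the standard CiA 402 statusword bit layout (object 0x6041, which is not mentioned in this thread); the type and function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    CIA402_NOT_READY_TO_SWITCH_ON,
    CIA402_SWITCH_ON_DISABLED,
    CIA402_READY_TO_SWITCH_ON,
    CIA402_SWITCHED_ON,
    CIA402_OPERATION_ENABLED,
    CIA402_QUICK_STOP_ACTIVE,
    CIA402_FAULT_REACTION_ACTIVE,
    CIA402_FAULT,
    CIA402_UNKNOWN
} cia402_state;

/* Decode a CiA 402 statusword into a state-machine state.
 * Assumes the standard bit layout; vendor-specific bits are masked out. */
cia402_state cia402_decode(uint16_t statusword)
{
    /* States that ignore bit 5 (quick stop). */
    switch (statusword & 0x004F) {
    case 0x0000: return CIA402_NOT_READY_TO_SWITCH_ON;
    case 0x0040: return CIA402_SWITCH_ON_DISABLED;
    case 0x000F: return CIA402_FAULT_REACTION_ACTIVE;
    case 0x0008: return CIA402_FAULT;
    default: break;
    }
    /* States that also depend on bit 5. */
    switch (statusword & 0x006F) {
    case 0x0021: return CIA402_READY_TO_SWITCH_ON;
    case 0x0023: return CIA402_SWITCHED_ON;
    case 0x0027: return CIA402_OPERATION_ENABLED;
    case 0x0007: return CIA402_QUICK_STOP_ACTIVE;
    default:     return CIA402_UNKNOWN;
    }
}

/* Human-readable name for logging. */
const char *cia402_state_name(cia402_state s)
{
    static const char *const names[] = {
        "not ready to switch on", "switch on disabled",
        "ready to switch on", "switched on", "operation enabled",
        "quick stop active", "fault reaction active", "fault", "unknown"
    };
    assert((unsigned)s <= (unsigned)CIA402_UNKNOWN);
    return names[s];
}
```

Logging a timestamp whenever the decoded state differs from the previous cycle makes it possible to line the transition up with timing measurements of the PDO loop.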

System configuration: Ubuntu 20.04, Linux 5.15.158 kernel with the corresponding RT patches.

I'd appreciate any advice you can give.

crown133 commented 1 month ago

By the way, I call 'ecx_send_processdata()' and 'ecx_receive_processdata()' sequentially in the same thread.
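A single-thread send/receive loop of this kind is often paced with an absolute-time sleep so that timing errors do not accumulate. The following is only a sketch of that pattern, not the poster's actual code: `run_cyclic`, `pdo_exchange_fn`, and `count_exchange` are hypothetical names, and the callback stands in for the real SOEM calls so the example runs without a network interface.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000L

/* Advance an absolute timespec by ns nanoseconds, keeping tv_nsec < 1 s. */
void timespec_add_ns(struct timespec *t, long ns)
{
    t->tv_nsec += ns;
    while (t->tv_nsec >= NSEC_PER_SEC) {
        t->tv_nsec -= NSEC_PER_SEC;
        t->tv_sec++;
    }
}

/* Hypothetical callback: performs one process-data exchange and returns
 * the working counter. With SOEM it would wrap something like:
 *     ecx_send_processdata(context);
 *     return ecx_receive_processdata(context, EC_TIMEOUTRET);
 */
typedef int (*pdo_exchange_fn)(void *ctx);

/* Stand-in exchange used for demonstration: counts calls. */
int count_exchange(void *ctx)
{
    int *n = ctx;
    return ++(*n);
}

/* Run `cycles` exchanges at a fixed period; returns the last working counter. */
int run_cyclic(pdo_exchange_fn exchange, void *ctx, long period_ns, int cycles)
{
    assert(exchange != NULL && period_ns > 0 && period_ns <= NSEC_PER_SEC);
    struct timespec next;
    int last_wkc = 0;
    clock_gettime(CLOCK_MONOTONIC, &next);
    for (int i = 0; i < cycles; i++) {
        timespec_add_ns(&next, period_ns);
        /* Sleep until an absolute deadline so that lateness in one
         * cycle does not shift all following cycles. */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
        last_wkc = exchange(ctx);
    }
    return last_wkc;
}
```

Because send and receive run back to back, any delay in the network stack extends the cycle directly, which is why the loop period is worth measuring.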

ArthurKetels commented 1 month ago

Have you read any status registers from the Elmo?

Most likely you had some communication disruption and could not maintain synchronous PDO exchange. You can monitor the timing of your real-time PDO loop for jitter and drop-outs. Then it is a question of searching for the cause. The fact that your PDO task runs on an isolated CPU does not mean it cannot block. It depends on the Linux network stack to send and receive packets. If there is contention for resources in the Linux kernel, you still end up with latency.

Does the controller task use sockets or any other network related resources? Do you use IO bandwidth on the same bus as the NIC?
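One way to monitor loop timing as suggested above is to timestamp every cycle and track the deviation from the nominal period. This is a minimal sketch; the `jitter_stats` structure, its functions, and the idea of a fixed lateness limit are assumptions made for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <time.h>

typedef struct {
    int64_t  period_ns;      /* nominal cycle time */
    int64_t  limit_ns;       /* lateness that counts as an overrun */
    int64_t  last_ns;        /* timestamp of the previous cycle */
    int64_t  max_jitter_ns;  /* largest |actual period - nominal| seen */
    uint64_t overruns;       /* cycles later than period + limit */
    uint64_t samples;        /* number of timestamps recorded */
} jitter_stats;

/* Current CLOCK_MONOTONIC time in nanoseconds. */
int64_t monotonic_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

void jitter_init(jitter_stats *s, int64_t period_ns, int64_t limit_ns)
{
    assert(s != NULL && period_ns > 0 && limit_ns >= 0);
    s->period_ns = period_ns;
    s->limit_ns = limit_ns;
    s->last_ns = 0;
    s->max_jitter_ns = 0;
    s->overruns = 0;
    s->samples = 0;
}

/* Record one cycle timestamp, e.g. taken right after the exchange. */
void jitter_sample(jitter_stats *s, int64_t now_ns)
{
    if (s->samples > 0) {
        int64_t error = (now_ns - s->last_ns) - s->period_ns;
        int64_t jitter = error < 0 ? -error : error;
        if (jitter > s->max_jitter_ns) {
            s->max_jitter_ns = jitter;
        }
        if (error > s->limit_ns) {
            s->overruns++;
        }
    }
    s->last_ns = now_ns;
    s->samples++;
}
```

Calling `jitter_sample(&stats, monotonic_ns())` once per cycle and printing the counters from a lower-priority thread keeps logging out of the real-time path.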

crown133 commented 1 month ago

Thanks for your thorough analysis. I've read the Elmo's error code object (0x603f) via SDO and also monitored the Elmo's state with EASII. No error occurred. I monitored the jitter and found that there is more jitter when the controller thread runs. I'd like to know how to monitor the drop-outs. I checked again to make sure that no other task uses sockets or any other network-related resources. I even turned off Wi-Fi and Bluetooth. It still doesn't work. Hoping for more suggestions.
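On the question of monitoring drop-outs, one common approach is to compare the working counter returned by each exchange against the expected value and count the cycles that fall short. The sketch below assumes that approach; `wkc_monitor` and its functions are hypothetical names. In SOEM's example code the expected value is computed from the group's `outputsWKC * 2 + inputsWKC`, and `ecx_receive_processdata()` returns a non-positive value when no frame came back.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    int      expected_wkc;     /* working counter of a complete exchange */
    uint64_t cycles;           /* total cycles observed */
    uint64_t dropped;          /* cycles with wkc below expected */
    uint32_t consecutive;      /* current run of dropped cycles */
    uint32_t max_consecutive;  /* longest run of dropped cycles */
} wkc_monitor;

void wkc_monitor_init(wkc_monitor *m, int expected_wkc)
{
    assert(m != NULL && expected_wkc > 0);
    m->expected_wkc = expected_wkc;
    m->cycles = 0;
    m->dropped = 0;
    m->consecutive = 0;
    m->max_consecutive = 0;
}

/* Feed the working counter of one cycle.
 * Returns 1 if the cycle counts as a drop-out, 0 otherwise. */
int wkc_monitor_update(wkc_monitor *m, int wkc)
{
    m->cycles++;
    if (wkc >= m->expected_wkc) {
        m->consecutive = 0;
        return 0;
    }
    m->dropped++;
    m->consecutive++;
    if (m->consecutive > m->max_consecutive) {
        m->max_consecutive = m->consecutive;
    }
    return 1;
}
```

The longest run of consecutive drop-outs is usually the more telling number, since a single missed frame may be tolerated while several in a row may not be.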

crown133 commented 1 month ago

I also found that when the cycle time of receiving and writing is set to 20 ms, the state of the Elmo is very stable. It is unstable if the cycle time is less than 2.5 ms.