windsgo opened 8 months ago
SOEM adds very little delay to the cycle. Most of it is used up in the Linux kernel network stack on receive. Optimizing packet receive to user space handover is a topic that is well described on the internet. Your friend is ethtool, see drvcomment.txt.
On the other hand, it is not optimal to send a packet and then wait for it to return (as you are doing).
Your situation :
- start cycle - send process data - receive process data - calculations - wait for next cycle start -
Optimal solution 1 :
- start cycle - receive process data - send process data - calculations - wait for next cycle start -
Optimal solution 2 :
- start cycle - receive process data - calculations - send process data - wait for next cycle start -
Solution 1 optimizes compute efficiency, solution 2 optimizes calculation to setpoint delay.
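A minimal sketch of solution 1, assuming the standard SOEM cyclic calls and a clock_nanosleep based cycle timer (names and cycle time are examples, not a definitive implementation):

#include <time.h>
#include "ethercat.h"        /* SOEM master API */

#define CYCLE_NS 1000000L    /* 1 ms cycle, adjust to your application */

static void cyclic_task(void)
{
   struct timespec next;
   clock_gettime(CLOCK_MONOTONIC, &next);

   while (1)
   {
      /* start of cycle: pick up the frame that was sent last cycle */
      int wkc = ec_receive_processdata(EC_TIMEOUTRET);
      (void)wkc;                         /* check against the expected wkc in real code */

      /* send immediately, using the outputs prepared last cycle */
      ec_send_processdata();

      /* calculations: prepare the outputs for the next cycle here */

      /* wait for next cycle start */
      next.tv_nsec += CYCLE_NS;
      while (next.tv_nsec >= 1000000000L) { next.tv_nsec -= 1000000000L; next.tv_sec++; }
      clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &next, NULL);
   }
}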
B.t.w., using CSP is not a very good control mode. You have very little control over velocity and even less over acceleration and jerk.
Thank you very much for your suggestions and replies. I will try both the NIC optimization (really grateful for that) and the cycle ordering. Since I'm having delay problems here, if the 230us delay continues to be a problem I may have trouble using CSV control; I think CSV needs a higher-frequency cycle. Anyway, thanks for your suggestions.
I hope that SOEM really adds very little delay to the cycle. However, I found that when I use a CMake Release build, I get about a 127us delay from the ec_xxx functions, which is half of what I get with a CMake Debug build (the 230us mentioned above). I import SOEM into my project with add_subdirectory(), so SOEM should be affected by CMAKE_BUILD_TYPE, I think. Does this phenomenon imply that there IS some (not so little) delay from the SOEM code itself?
I'm trying Optimal solution 1. I called ec_send_processdata immediately after ec_receive_processdata, and I found a problem: ec_receive_processdata clears some of my TxPDO data, specifically the ControlWord of the motor in the PDO I want to send. I set all the TxPDO data in the calculation step of Optimal solution 1. Why does this "clear data" behaviour happen?
I found I need to set all the TxPDO data between ec_receive_processdata and ec_send_processdata; the previous comment was my wrong programming sequence, sorry. After using Optimal solution 1, the latency over the whole period stays below 10us, EVEN with my generic NIC driver and an external USB network device; the minimum is about 5~6us. That is obviously much, much better performance than what I had before.
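For reference, a rough sketch of the ordering that now works for me, assuming a single drive whose ControlWord sits at the start of ec_slave[1].outputs (the offset is only an example and depends on the actual PDO mapping):

#include <string.h>
#include <stdint.h>
#include "ethercat.h"

/* one cycle in the "optimal solution 1" ordering */
static void one_cycle(uint16_t controlword)
{
   ec_receive_processdata(EC_TIMEOUTRET);   /* inputs from the previous frame */

   /* refresh ALL TxPDO output data between receive and send */
   memcpy(ec_slave[1].outputs, &controlword, sizeof(controlword));

   ec_send_processdata();                   /* frame goes out immediately */

   /* calculations and the wait for the next cycle start follow here */
}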
Thanks again for your optimal solutions. Much appreciated.
@ArthurKetels
Sorry to ping you again.
I found that when I call ec_receive_processdata(200), the 200us timeout value does not seem to work. I am occasionally blocked in this function for more than 2000us (that includes the call to ec_send_processdata, but the send should not block for long, I think). It seems this timeout parameter does not work properly?
This delay is not part of SOEM but of the kernel socket recv() function. The socket is created as non-blocking and with a maximum delay of 1us. But it is up to the actual NIC driver to honour this. Do you use the PREEMPT-RT kernel? And if so, what is your priority? Then there are known problems with NMI (for BIOS power management) that can generate latencies up to 2ms.
First check your Linux system for latency performance. Then check how much extra SOEM packs on top of that.
I'm using the PREEMPT-RT kernel under Ubuntu 22.04 (the official rt-kernel from Ubuntu). The thread is scheduled with the SCHED_FIFO policy at priority 99. I also isolate CPU 2 and CPU 3 in the kernel start-up cmdline (isolcpus=2,3) together with nohz_full=2,3 (I have 8 logical CPUs), and give my real-time thread affinity to CPU 2. Besides, I write 0 to /dev/cpu_dma_latency just as cyclictest does, and it has an obvious effect.
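For what it's worth, my thread setup looks roughly like this (a sketch; error handling omitted, values as described above):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdint.h>

/* pin the calling thread to an isolated CPU and schedule it SCHED_FIFO prio 99 */
static void setup_rt_thread(int cpu)
{
   cpu_set_t set;
   CPU_ZERO(&set);
   CPU_SET(cpu, &set);
   pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

   struct sched_param sp = { .sched_priority = 99 };
   pthread_setschedparam(pthread_self(), SCHED_FIFO, &sp);
}

/* write 0 to /dev/cpu_dma_latency and keep the fd open (the cyclictest trick)
   so the CPU is not allowed to enter deep C-states */
static int lock_cpu_dma_latency(void)
{
   int32_t zero = 0;
   int fd = open("/dev/cpu_dma_latency", O_WRONLY);
   if (fd >= 0)
   {
      ssize_t n = write(fd, &zero, sizeof(zero));
      (void)n;
   }
   return fd;   /* must stay open for the setting to remain in effect */
}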
When I use cyclictest (from the rt-tests tools), this 2ms delay never occurs (the maximum over a 24-hour stress test is about 100us).
Perhaps this is caused by the NMI problem you mentioned. What can I do about it? I have turned off BIOS options like C-states, Intel SpeedStep and Intel Speed Shift. (I'm using an Intel i7-6700 CPU; the BIOS is an American Megatrends BIOS, I think.)
Also, what do you mean by "check how much extra SOEM packs on top of that"? By the way, this 2ms delay happens roughly once every 5 to 30 minutes; not very frequent, but not rare either.
I've read the ec_read function and I also realized that this may be blocked by the socket recv() function:
/* we use RAW packet socket, with packet type ETH_P_ECAT */
*psock = socket(PF_PACKET, SOCK_RAW, htons(ETH_P_ECAT));
timeout.tv_sec = 0;
timeout.tv_usec = 1;
r = setsockopt(*psock, SOL_SOCKET, SO_RCVTIMEO, &timeout, sizeof(timeout));
r = setsockopt(*psock, SOL_SOCKET, SO_SNDTIMEO, &timeout, sizeof(timeout));
i = 1;
r = setsockopt(*psock, SOL_SOCKET, SO_DONTROUTE, &i, sizeof(i));
It seems the socket is not configured as completely non-blocking. So is it possible to create the socket with flags like SOCK_NONBLOCK or O_NONBLOCK?
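For example, something like this would make recv() return EAGAIN immediately instead of waiting out SO_RCVTIMEO (just a sketch of what I mean, not how nicdrv.c currently does it):

#include <sys/socket.h>
#include <arpa/inet.h>   /* htons */
#include <fcntl.h>

#define ETH_P_ECAT 0x88A4   /* EtherCAT EtherType, same value SOEM uses */

static int open_nonblocking_ecat_socket(void)
{
   /* SOCK_NONBLOCK sets O_NONBLOCK atomically at creation time (Linux) */
   int s = socket(PF_PACKET, SOCK_RAW | SOCK_NONBLOCK, htons(ETH_P_ECAT));

   /* equivalent alternative after creation:
      fcntl(s, F_SETFL, fcntl(s, F_GETFL, 0) | O_NONBLOCK); */
   return s;
}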
By the way, when I use my laptop, which has an AMD R7-5800H CPU, this seldom happens XD. Is this an Intel problem?
I'd like to add some additional information. I logged the cases where the timeout grew as large as the 2ms mentioned above. It is always caused inside the ec_xxx functions I call, never by clock_nanosleep.
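This is roughly how I log it (a sketch; the threshold and names are mine):

#include <time.h>
#include <stdio.h>
#include "ethercat.h"

/* time one send+receive pair and print outliers in microseconds */
static void timed_exchange(void)
{
   struct timespec t0, t1;

   clock_gettime(CLOCK_MONOTONIC, &t0);
   ec_send_processdata();
   ec_receive_processdata(200);             /* 200 us timeout, as in my code */
   clock_gettime(CLOCK_MONOTONIC, &t1);

   long us = (t1.tv_sec - t0.tv_sec) * 1000000L +
             (t1.tv_nsec - t0.tv_nsec) / 1000L;
   if (us > 500)                            /* arbitrary outlier threshold */
      printf("ec_xxx exchange took %ld us\n", us);
}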
@windsgo Hello, have you already solved this problem? I also encountered the same problem when using it.
I have solved this problem. I use the acontis kernel module atemsys (https://github.com/acontis/atemsys) and developed a userspace NIC driver. In my project the latency is less than 50us.
@toshisanro Thanks for your reply, it will help me a lot. During my recent testing, I found that after adding a large number of PDOs, sending and receiving data takes a lot of time, causing the entire communication cycle to jitter greatly. Have you done any related tests?
I think the latency is caused by the Linux kernel network stack and the generic NIC driver. So instead of using SOEM, I tried IGH with a modified NIC driver, which works.
I think so too. I use the Intel generic NIC driver. After adding a large amount of PDO data, the send and receive becomes greatly jittered. But nicdrv.c in SOEM uses a raw socket, and I'm not sure if this is caused by IRQs or something else. I haven't used IGH yet; how is it working now?
SOEM seems to use a raw socket with a receive timeout of 1us, and I think it may be blocked by the kernel network stack during one of the calls to the recv system call inside ec_receive_processdata(timeout). IGH has a kernel module which operates the hardware directly through the modified NIC driver. It can be used from userspace through the character device interface (ecrt_xxx) provided by the IGH project.
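Roughly, the IGH userspace cyclic exchange looks like this (a sketch of the ecrt_* calls only; master/domain setup and PDO registration are omitted):

#include <ecrt.h>   /* IgH EtherCAT master userspace library */

/* one cycle with the IgH master; master and domain come from
   ecrt_request_master() / ecrt_master_create_domain() during setup */
static void igh_cycle(ec_master_t *master, ec_domain_t *domain)
{
   ecrt_master_receive(master);   /* fetch frames received by the kernel module */
   ecrt_domain_process(domain);   /* update the domain process data image */

   /* read inputs / write outputs in the domain memory here */

   ecrt_domain_queue(domain);     /* mark the domain data to be sent */
   ecrt_master_send(master);      /* hand the frames to the NIC driver */
}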
I think it is not recommended to send and receive too much data through PDOs in a real-time, high-frequency environment. Anyway, I think the original SOEM project is not the best choice on Linux, since SOEM itself does not provide direct access to the hardware on Linux (that must be done with specific kernel modules, I think), while IGH was originally designed for kernel space. For most network devices, the IGH v1.6.0 stable branch provides modified NIC drivers, which are easy to use.
@Neverforgetlove I use both Xenomai and Preempt RT for testing. I added six servo drives with about 168 bytes of input PDOs and 96 bytes of outputs, cycle time 1ms with DC. Xenomai's result is better than Preempt RT; with RT the servos always lose DC sync. Here are the Xenomai results comparing the socket path with my userspace NIC driver: ec_receive max latency is 84us (my driver) vs 500us (socket), the average is less than 50us (my driver), and cycle jitter is less than 30us.
Additionally, the low-power mode of the CPU may lead to increased latency.
That sounds great. I have over 700 bytes of input PDOs; that much PDO data causes a large delay when sending and receiving, resulting in large cycle jitter (maximum over 200us), which is unacceptable.
What is your communication cycle at the moment? Actually I think you could try IGH; my IGH userspace setup is working now. With SOEM you have to build the userspace driver and kernel module yourself, which is actually quite a hassle.
4ms. Testing on CODESYS gives very good results; I don't know what processing it does internally. I will try testing with IGH later.
Please guys, the forum is English language only. Others also want to follow the conversation.
I'm using SOEM to drive a Panasonic motor. This is how I am working with SOEM: I call ec_send_processdata at the beginning of each 1ms period to reduce communication latency, and I call ec_receive_processdata immediately after ec_send_processdata. These two functions cost about 230us.
My questions are:
- I found the IGH project, which describes how to modify the NIC driver and use it with IGH. But I'm still confused about what specifically I can do with SOEM regarding the NIC driver. (Sorry, I know little about Linux driver programming; I want to know how this generally works.)
- Should I use a Preempt RT patched Linux kernel? What is the anticipated performance in that situation?
- Is it reasonable to call ec_read immediately after ec_write? Note that I must call ec_send first at the beginning of each period to control the motor (with CSP control mode).
Many thanks
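For clarity, a minimal sketch of the cycle described above (assuming the standard SOEM calls; the ~230us is what I measure around the two ec_xxx calls):

#include <time.h>
#include "ethercat.h"

/* current pattern: send at the start of each 1 ms period, then wait for the reply */
static void my_cycle(struct timespec *next)
{
   ec_send_processdata();                   /* outputs prepared last cycle */
   ec_receive_processdata(EC_TIMEOUTRET);   /* blocks until the frame is back */

   /* compute the next CSP target position into ec_slave[].outputs here */

   /* sleep until the next 1 ms boundary */
   next->tv_nsec += 1000000L;
   while (next->tv_nsec >= 1000000000L) { next->tv_nsec -= 1000000000L; next->tv_sec++; }
   clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, next, NULL);
}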