OpenEtherCATsociety / SOEM

Simple Open Source EtherCAT Master

send and receive processdata Timing has Noise. #597

Closed human2154 closed 2 years ago

human2154 commented 2 years ago

Hello. I'm a beginner with EtherCAT, and thank you for developing SOEM.

I have a problem with EtherCAT timing and would appreciate help solving it. I use the 'simpletest' code to initialize my EtherCAT slave, an ELMO Platinum motor controller, and want to run it at 1 kHz. I use a Hilscher netANALYZER to analyze the timing, and I referred to issue #146 while trying to solve the problem.

  1. I used the default 'simpletest' code -> I could not read or write the slave's data.

  2. I added ec_slave[slc].blockLRW = 1; after the PDO mapping; the resulting timing is shown in the figure below. -> In the right-hand graph there are two lines, at 0 us and 1000 us. This means the data is sent twice. [Figure: Platinum]

  3. I changed every ec_send_processdata(); and ec_config_map(&IOmap); to ec_send_overlap_processdata(); and ec_config_overlap_map(&IOmap); in 'simpletest' and 'ecatthread' -> I could not initialize my slave.

  4. I removed ec_slave[slc].blockLRW = 1; -> I can read and write data, and it runs at 1 kHz! -> But another issue appeared: the timing has periodic noise that looks like nail marks, shown in the right-hand figure below. -> This EtherCAT noise disturbs the motor control signal and produces a periodic noisy sound in my motor. (The EtherCAT noise and the audible noise have the same period.) [Figure: 20220302_172232]

Below are my ecatthread and my modified simpletest code. Did I make a mistake with the LRW and overlap functions? Please help me. Thank you.

void *ecatthread(void *ptr)
{
    struct timespec ts, tleft;
    int ht;
    int cycletime;
    int toff = 0;
    struct sched_param p;
    p.sched_priority = sched_get_priority_max(SCHED_FIFO);
    pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);
    clock_gettime(CLOCK_MONOTONIC, &ts);
    ht = (ts.tv_nsec / 1000000) + 1; /* round to nearest ms */
    ts.tv_nsec = ht * 1000000;
    cycletime = *(int *)ptr * 1000; /* cycletime in ns */
    printf("create ecatthread! \n");
    printf("pthread priority = %d\n", p.sched_priority);

    ec_send_overlap_processdata();
    //-----------------------------------------------------------------------------------------
    while (1)
    {
        /* calculate next cycle start */
        add_timespec(&ts, cycletime + toff);
        /* wait to cycle start */
        clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, &ts, &tleft);

        // pthread_mutex_lock(&mutex);
        wkc = ec_receive_processdata(EC_TIMEOUTRET);

        if (ec_slave[0].hasdc)
        {
            ec_sync(ec_DCtime, cycletime, &toff);
        }
        ec_send_overlap_processdata();    
    }
}
bool simpletest(char *ifname)
{
    int i, j, oloop, iloop, chk, slc;
    u_int mmResult;
    needlf = FALSE;
    inOP = FALSE;
    printf("Starting simple test\n");
    /* initialise SOEM, bind socket to ifname */
    if (ec_init(ifname)) // Network Interface Card NIC
    {
        printf("ec_init on %s succeeded.\n", ifname);
        /* find and auto-config slaves */
        if (ec_config_init(FALSE) > 0) // Mailbox Setup, Slave -> PRE_OP Mode, All data read and configured are stored in a global array
        {
            printf("%d slaves found and configured.\n", ec_slavecount);
            if ((ec_slavecount >= 1)) // ec_slavecount : number of slaves found on the network
            {
                for (slc = 1; slc <= ec_slavecount; slc++) // ec_slave[0] : reserved for the master
                {
                    printf("Name: %s EEpMan: %d eep_id: %d configadr: %d aliasadr: %d State %d\n\r ",
                           ec_slave[slc].name, ec_slave[slc].eep_man, ec_slave[slc].eep_id, ec_slave[slc].configadr, ec_slave[slc].aliasadr, ec_slave[slc].state);
                    if (ec_slave[slc].eep_man == 154 && ec_slave[slc].eep_id == 17825794) // ELMO PLATINUM's man & id
                    {
                        printf("slave%d -> ELMO PLATINUM detected\n", slc - 1);
                        ec_slave[slc].PO2SOconfig = &ELMOsetupPlatinum;
                        // ec_slave[slc].blockLRW = 1; // Platinum does not support LRW function
                    }
                    else 
                    {
                        printf("unknown Slave\n");
                    }
                }
            }
            ec_config_overlap_map(&IOmap); // 32 Map all PDOs from slaves to IOmap with Outputs / Inputs in sequential order(legacy SOEM way).
            ec_configdc();

            printf("Slaves mapped, state to SAFE_OP.\n");
            /* wait for all slaves to reach SAFE_OP state , When the mapping is done
            SOEM requests slaves to enter SAFE_OP.*/
            ec_statecheck(0, EC_STATE_SAFE_OP, EC_TIMEOUTSTATE * 4);
            // Operation Mode
            oloop = ec_slave[0].Obytes;
            if ((oloop == 0) && (ec_slave[0].Obits > 0))
                oloop = 1;
            if (oloop > 8)
                oloop = 8;
            iloop = ec_slave[0].Ibytes;
            if ((iloop == 0) && (ec_slave[0].Ibits > 0))
                iloop = 1;
            if (iloop > 8)
                iloop = 8;
            printf("segments : %d : %d %d %d %d\n", ec_group[0].nsegments,
                   ec_group[0].IOsegment[0], ec_group[0].IOsegment[1], ec_group[0].IOsegment[2],
                   ec_group[0].IOsegment[3]);
            printf("Request operational state for all slaves\n");
            expectedWKC = (ec_group[0].outputsWKC * 2) + ec_group[0].inputsWKC;
            printf("Calculated workcounter %d\n", expectedWKC);
            ec_slave[0].state = EC_STATE_OPERATIONAL;
            ///////////////////////////////// 0 or 1
            /* send one valid process data to make outputs in slaves happy*/
            ec_send_overlap_processdata();
            ec_receive_processdata(EC_TIMEOUTRET);
            /* request OP state for all slaves */
            ec_writestate(0); ///////////////////////////////// 0 or 1
            chk = 200;
            /* wait for all slaves to reach OP state */
            do
            {
                ec_send_overlap_processdata();
                ec_receive_processdata(EC_TIMEOUTRET);
                ec_statecheck(0, EC_STATE_OPERATIONAL, 50000);
            } while (chk-- && (ec_slave[0].state != EC_STATE_OPERATIONAL));

            if (ec_slave[0].state == EC_STATE_OPERATIONAL)
            {
                printf("Operational state reached for all slaves.\n");
                inOP = TRUE;
                /* cyclic loop, reads data from RT thread */
                int PCL = 30; // Processdata Cycle Loop
                for (i = 1; i <= PCL; i++)
                {
                    ec_send_overlap_processdata();
                    wkc = ec_receive_processdata(EC_TIMEOUTRET);
                    if (wkc >= expectedWKC)
                    {
                        printf("Processdata cycle %4d, WKC %d , O:", i, wkc);
                        for (j = 0; j < oloop; j++)
                        {
                            printf(" %2.2x", *(ec_slave[0].outputs + j));
                        }
                        printf(" I:");
                        for (j = 0; j < iloop; j++)
                        {
                            printf(" %2.2x", *(ec_slave[0].inputs + j));
                        }
                        printf("\nOutput Byte: %d / input Byte: %d\n", oloop, iloop);
                        // printf(" T:%lld\r", ec_DCtime);
                        needlf = TRUE;
                    }
                    osal_usleep(50000);
                }
                inOP = FALSE;
            }
            else
            {
                printf("Not all slaves reached operational state.\n");
                ec_readstate();
                for (i = 1; i <= ec_slavecount; i++)
                {
                    if (ec_slave[i].state != EC_STATE_OPERATIONAL)
                    {
                        printf("Slave %d State=0x%2.2x StatusCode=0x%4.4x : %s\n",
                               i, ec_slave[i].state, ec_slave[i].ALstatuscode,
                               ec_ALstatuscode2string(ec_slave[i].ALstatuscode));
                    }
                }
                return false;
            }
            printf("\nRequest init state for all slaves\n");
            ec_slave[0].state = EC_STATE_INIT;
            /* request INIT state for all slaves */
            ec_writestate(0);
        }
        else
        {
            printf("No slaves found!\n");
            return false;
        }
        // printf("End simple test\n");
    }
    else
    {
        printf("No socket connection on %s\nExecute as root\n", ifname);
        return false;
    }
    return true;
}
ArthurKetels commented 2 years ago

Thanks for the detailed information. However, you do not mention your OS version. Is it a PREEMPT_RT-patched Linux kernel? Which version? What hardware are you running this test on?

SOEM will not magically make your system real-time with no jitter. You first have to check the real-time capability of your system. Are there any sources of jitter? Is your NIC configured for the lowest latency? Did you move the real-time task to its own isolated CPU? What are the priorities of your NIC interrupts? On which CPU are they handled? How are the C-states of your system configured? Do you run drivers (for example NVIDIA) that are known to cause latency spikes?

Further, I would advise reading the other posts here on the subject. You are not the first to ask this question.
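
One of the questions above concerns priorities. The ecatthread code posted in this issue calls pthread_setschedparam without checking the result, so a rejected request (for example, when the process lacks the required privileges) would go unnoticed and the thread would keep its default scheduling. A minimal sketch of a checked request, assuming a Linux/POSIX system; request_fifo is a hypothetical helper name and is not part of SOEM:

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper (not part of SOEM): request SCHED_FIFO for the calling
 * thread at the given priority.
 * Returns 0 on success or an errno value, typically EPERM when the process
 * is not allowed to use real-time scheduling. */
static int request_fifo(int priority)
{
    struct sched_param p;
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    /* Reject priorities outside the range the system reports for SCHED_FIFO. */
    if (priority < lo || priority > hi)
    {
        return EINVAL;
    }
    p.sched_priority = priority;
    /* pthread_setschedparam returns the error number directly, not via errno. */
    return pthread_setschedparam(pthread_self(), SCHED_FIFO, &p);
}
```

In the cyclic thread, the return value could be printed once at startup, so that a failed request is visible before any timing is measured. Whether this applies unchanged under a Xenomai kernel is not covered here.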

human2154 commented 2 years ago

First, thank you for replying. I will check the other issues too. My goal is to control my motor without audible noise, but there is some noise caused by EtherCAT jitter.

So I deleted almost all of the code, and ecatthread now only runs the ec_receive and ec_send calls. The main function just calls simpletest, creates the EtherCAT thread, and does nothing else. There are no other functions in the code; it exists only to test jitter.

My master device specification is as follows:
- OS: Ubuntu 16.04
- ROS Kinetic
- Kernel: Xenomai 4.9.146
- Ethernet controller: Intel® 82574L Gigabit Ethernet Controller
- CPU: i5 10400
- RAM: 8 GB
- GPU: CPU built-in graphics
- Slave: ELMO Platinum motor controller

SOEM (Simple Open EtherCAT Master)
Slaveinfo
Starting slaveinfo
ec_init on rteth0 succeeded.
1 slaves found and configured.
Calculated workcounter 3

Slave:1
 Name:ModuleSlotsDrive
 Output size: 48bits
 Input size: 48bits
 State: 4
 Delay: 0[ns]
 Has DC: 1
 DCParentport:0
 Activeports:1.0.0.0
 Configured address: 1001
 Man: 0000009a ID: 01100002 Rev: 00120025
 SM0 A:1000 L: 256 F:00010026 Type:1
 SM1 A:1400 L: 256 F:00010022 Type:2
 SM2 A:1800 L:   6 F:00010064 Type:3
 SM3 A:1c00 L:   6 F:00010020 Type:4
 FMMU0 Ls:00000000 Ll:   6 Lsb:0 Leb:7 Ps:1800 Psb:0 Ty:02 Act:01
 FMMU1 Ls:00000006 Ll:   6 Lsb:0 Leb:7 Ps:1c00 Psb:0 Ty:01 Act:01
 FMMUfunc 0:1 1:2 2:3 3:0
 MBX length wr: 256 rd: 256 MBX protocols : 0e
 CoE details: 2f FoE details: 01 EoE details: 01 SoE details: 00
 Ebus current: 0[mA]
 only LRD/LWR:0
 CoE Object Description found, 492 entries.
 Index: 1000 Datatype: 0007 Objectcode: 07 Name: Device type
  Sub: 00 Datatype: 0007 Bitlength: 0020 Obj.access: 0007 Name: Device type
          Value :0x00020192 131474
 Index: 1001 Datatype: 0005 Objectcode: 07 Name: Error register
  Sub: 00 Datatype: 0005 Bitlength: 0008 Obj.access: 0007 Name: Error register
          Value :0x24 36
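
For reference, the "Calculated workcounter 3" line in the output above is consistent with the formula used in the simpletest code, outputsWKC * 2 + inputsWKC, if the single slave contributes 1 to each counter. That is an assumption, since the output does not print the two counters separately. A small sketch with hypothetical helper names:

```c
/* Expected working counter for one process-data exchange, using the same
 * formula as the simpletest code in this issue:
 *   expectedWKC = outputsWKC * 2 + inputsWKC
 * The argument values are assumptions; the thread only shows the result. */
static int expected_wkc(int outputs_wkc, int inputs_wkc)
{
    return outputs_wkc * 2 + inputs_wkc;
}

/* Same acceptance test as simpletest: a cycle is treated as valid when the
 * returned working counter is at least the expected value. A negative
 * value (timeout) therefore never passes. */
static int wkc_is_valid(int wkc, int expected)
{
    return wkc >= expected;
}
```

With one slave that has both outputs and inputs, expected_wkc(1, 1) gives 3, which matches the printed value.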

Additionally, when I remove the DC code, the same jitter does not occur. But some strange data appears every second, and this data causes noise too. The strange data is marked by the red box in the figure. [Figure: Gold No DC]

ArthurKetels commented 2 years ago

Your last graph does not show noise or timing jitter; it shows a complete cycle being skipped. It could be related to interrupt coalescing in your NIC driver. What do your NIC settings look like in ethtool?

What you could do to figure this out is instrument the PDO loop. Take timestamps of send_processdata and receive_processdata. Also check receive_processdata for timeouts (WKC < 0). This would show whether the problem is in the TX or the RX stack.
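
The instrumentation suggested here could be sketched roughly as follows. This is only an illustration: cycle_stats_t, stats_init, stats_add and timespec_diff_ns are hypothetical names, not SOEM functions, and the statistics are kept deliberately simple so that nothing is printed inside the cyclic loop.

```c
#include <stdint.h>
#include <time.h>

/* Hypothetical helper type: running statistics for one measured interval,
 * for example the duration of the receive call or the send-to-send period. */
typedef struct
{
    int64_t min_ns;
    int64_t max_ns;
    int64_t sum_ns;
    uint64_t count;
    uint64_t timeouts; /* cycles in which the working counter was negative */
} cycle_stats_t;

static void stats_init(cycle_stats_t *s)
{
    s->min_ns = INT64_MAX;
    s->max_ns = INT64_MIN;
    s->sum_ns = 0;
    s->count = 0;
    s->timeouts = 0;
}

/* Signed difference b - a in nanoseconds. */
static int64_t timespec_diff_ns(const struct timespec *a, const struct timespec *b)
{
    return (int64_t)(b->tv_sec - a->tv_sec) * 1000000000LL
         + (int64_t)(b->tv_nsec - a->tv_nsec);
}

/* Record one sample together with the working counter of that cycle. */
static void stats_add(cycle_stats_t *s, int64_t sample_ns, int wkc)
{
    if (sample_ns < s->min_ns) s->min_ns = sample_ns;
    if (sample_ns > s->max_ns) s->max_ns = sample_ns;
    s->sum_ns += sample_ns;
    s->count++;
    if (wkc < 0) s->timeouts++;
}

/* Possible use inside the cyclic thread posted in this issue:
 *
 *   clock_gettime(CLOCK_MONOTONIC, &t0);
 *   wkc = ec_receive_processdata(EC_TIMEOUTRET);
 *   clock_gettime(CLOCK_MONOTONIC, &t1);
 *   stats_add(&rx_stats, timespec_diff_ns(&t0, &t1), wkc);
 */
```

Keeping one cycle_stats_t for the send call and one for the receive call, and printing them from a non-real-time thread, would show on which side the long intervals and timeouts occur.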

nakarlsson commented 2 years ago

@human2154 , can we close this issue?

human2154 commented 2 years ago

Thank you for replying to my issue. The cause of this jitter was the sleep functions. I changed my thread sleep and DC sleep functions, and that solved the problem. Thank you for your interest.

windsgo commented 5 months ago

> Thank you for replying to my issue. The cause of this jitter was the sleep functions. I changed my thread sleep and DC sleep functions, and that solved the problem. Thank you for your interest.

Could you tell me how you changed your thread sleep and DC sleep functions? Is there anything wrong with using the default clock_nanosleep function? I'm having this problem too.
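
The thread does not record what was actually changed, so the following is not the original poster's fix. It is only a generic sketch of an absolute-deadline sleep on CLOCK_MONOTONIC, which is the same mechanism the posted ecatthread already uses. The helper names deadline_advance and sleep_until are hypothetical. Two details it makes explicit: the deadline stays normalized even when the added value is negative (as a DC offset correction could be), and the sleep is restarted if a signal interrupts it.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <errno.h>
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000LL

/* Advance an absolute deadline by delta_ns (may be negative) and keep
 * tv_nsec within [0, 1e9). */
static void deadline_advance(struct timespec *ts, int64_t delta_ns)
{
    int64_t nsec = (int64_t)ts->tv_nsec + delta_ns;

    ts->tv_sec += (time_t)(nsec / NSEC_PER_SEC);
    nsec %= NSEC_PER_SEC;
    if (nsec < 0)
    {
        /* C division truncates toward zero, so borrow one second. */
        nsec += NSEC_PER_SEC;
        ts->tv_sec -= 1;
    }
    ts->tv_nsec = (long)nsec;
}

/* Sleep until the absolute CLOCK_MONOTONIC deadline, restarting when a
 * signal interrupts the sleep. Returns 0 or an errno value. */
static int sleep_until(const struct timespec *deadline)
{
    int rc;

    do
    {
        /* With TIMER_ABSTIME the remaining-time argument is not used. */
        rc = clock_nanosleep(CLOCK_MONOTONIC, TIMER_ABSTIME, deadline, NULL);
    } while (rc == EINTR);
    return rc;
}
```

In a cyclic thread, the pattern would be deadline_advance(&ts, cycletime + toff) followed by sleep_until(&ts), where cycletime and toff are the variables from the code posted above. Because the deadline is absolute, a late wake-up in one cycle does not shift the following cycles.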