bnjmnp / pysoem

Cython wrapper for the Simple Open EtherCAT Master Library
MIT License
96 stars 37 forks source link

【need help】confused about the incorrect wkc #58

Closed tom9672 closed 2 years ago

tom9672 commented 2 years ago

Sometimes the inconnect wkc occurs, and the check/recover fucntion useless. I'm not clear about what make this error happen. Could some one let me know the reason of 'inconnect wkc', that will help me a lot. Thank you so much!

tom9672 commented 2 years ago

what lead to incorrect wkc? @bnjmnp

  1. the quality of servo devices
  2. not real time hardware or not realtime system
  3. code (If the reason is the code, why the error occured randomly, sometimes it works well, sometimes it happeded.)
  4. other

It sometimes happened when i was stop a thread and start another thread. the pdo thread is always on, I start one thread to waiting for keyboard input, and this thread has a child thread that give target pos and something to let the device move. After this setp, i would click a button that stop this child thread and this thread. Th incorrect wkc happened sometimes while start or stop these thread. sometimes it happeded while I read (I tried both sdo and pdo read) current position while one threading was pdo loop and another thread was do the move cmd things.

Anyway, the incorrect make the devices unable work well.... The code is on the RPi4 not at my hand, I will upload the code later.

If I could know what makes the 'incorrect wkc', I may can solve the issue.

Thank you so much if could let me know the reason....

tom9672 commented 2 years ago
class Machinework:
    ...
    def move_handheld(self):
        self.move_handheld_thread_stop_event = threading.Event()
        if self.all_slaves_reached_op_state:
            self.move_handheld_thread = threading.Thread(target=self._handheld)
            self.move_handheld_thread.start()

    def move_handheld_stop(self):
        self.move_handheld_thread_stop_event.set()
        self.move_handheld_thread.join()

    def _handheld(self):
        self._master.in_op = True
        output_data = OutputPdo()
        try:
            for cmd in [6,7,15]:
                output_data.Control_Word = cmd
                self.slave1.output = bytes(output_data)
                self.slave2.output = bytes(output_data)
                self.slave3.output = bytes(output_data)
                time.sleep(0.02)
            output_data.Mode_Operation = 1
            s1_start_pos = self._convert_input_data(self.slave1.input).actual_position
            s2_start_pos = self._convert_input_data(self.slave2.input).actual_position
            s3_start_pos =self._convert_input_data(self.slave3.input).actual_position
            prev_axis = self.axis
            while not self.move_handheld_thread_stop_event.is_set():
                if self.axis != prev_axis or self.timeout:
                    s1_start_pos = self._convert_input_data(self.slave1.input).actual_position
                    s2_start_pos = self._convert_input_data(self.slave2.input).actual_position
                    s3_start_pos =self._convert_input_data(self.slave3.input).actual_position
                    prev_axis = self.axis
                    self.timeout = False
                time.sleep(0.1)
                if self.axis == 'x':
                    output_data.target_speed = self.target_speed
                    output_data.Target_Position = self.target_pos + s3_start_pos
                    for cmd in [47,63]:
                        output_data.Control_Word = cmd
                        self.slave3.output = bytes(output_data)
                        time.sleep(0.02)
                elif self.axis == 'y':
                    output_data.target_speed = self.target_speed
                    output_data.Target_Position = self.target_pos + s1_start_pos
                    for cmd in [47,63]:
                        output_data.Control_Word = cmd
                        self.slave1.output = bytes(output_data)
                        time.sleep(0.02)
                elif self.axis == 'z':
                    output_data.target_speed = self.target_speed
                    output_data.Target_Position = self.target_pos + s2_start_pos
                    for cmd in [47,63]:
                        output_data.Control_Word = cmd
                        self.slave2.output = bytes(output_data)
                        time.sleep(0.02)

            self.slave2.output = bytes(len(self.slave2.output)) 
            self.slave1.output = bytes(len(self.slave1.output)) 
            self.slave3.output = bytes(len(self.slave3.output))
            for cmd in [15]:
                output_data.Control_Word = cmd
                self.slave1.output = bytes(output_data)
                self.slave2.output = bytes(output_data)
                self.slave3.output = bytes(output_data)
                time.sleep(0.02)

        except KeyboardInterrupt:
            self.slave1.output = bytes(len(self.slave1.output)) 
            self.slave2.output = bytes(len(self.slave2.output)) 
            self.slave3.output = bytes(len(self.slave3.output)) 
            time.sleep(0.2)
            print('stopped')

    def check_all_pos(self):
        return ctypes.c_int32.from_buffer_copy(self.slave3.sdo_read(0x6064,0x0)).value,ctypes.c_int32.from_buffer_copy(self.slave1.sdo_read(0x6064,0x0)).value,ctypes.c_int32.from_buffer_copy(self.slave2.sdo_read(0x6064,0x0)).value
        # I also tried to use pdo input, to get the positions, but didn't solve the incorrect wkc err.
class Window:
    ...
    def run(self):
        dev_axis = self.check_axis()

        A = GPIO.input(self.a)
        B = GPIO.input(self.b)

        dataA = deque([9,9,9,9])
        dataB = deque([9,9,9,9])

        self.machinework.axis = dev_axis

       # here i call the move_handheld, it will create a thread, servo waiting for the signal from handheld
        self.machinework.move_handheld() 

        pre_axis = self.check_axis()
        start_time = time.time()

        while not self.hhThread_stop_event.is_set():
            # deal with the hand held signal, and set speed, pos ...... to let the servo motor move

    def start_handheld(self):
        self.hhThread_stop_event.clear()
        self.hhThread = threading.Thread(target=self.run)
        self.hhThread.start()

    def stop_hand_control_func(self):
        # here I call the move_handheld_stop to stop the thread, servo not waiting signal from handheld
        self.machinework.move_handheld_stop()
        # stop reading signal loop
        self.hhThread_stop_event.set()
        self.hhThread.join()

The handheld is a wheel that can make pulse.

The incorrect wkc happened while using the 'start_handheld' or 'stop_hand_control_func', or 'check_all_pos' while 'start_handheld'. ( _actual_wkc is 3, and the expected_wkc is 9)

I tried to use reatime kernel, not solve the incorrect wkc issue.

Most time, it works well as expected. But the incorrect wkc will happen suddenly. The check pdo thread will make the servo go to preop status, but I already lose control of servo.

@bnjmnp @bnjmnp Do you know what will lead to incorrect wkc. Thanks in advance.

bnjmnp commented 2 years ago

Yeah, the issues you have are maybe because of the lack of realtime in your system. I know some EtherCAT devices that use DC/SYNC0 synchronization fall back to SafeOP state after a while, because theses devices count the situations where PDO cycle and SYNC0 event are not in sync. The basic_example.py therefore has a thread _check_slave that kicks back the device into OP state. You may have read that in general it is hard to achive realtime. For the Raspberry Pi the problem starts with the onboard Ethernet port being an USB device.

On the other hand it is really hard to check your code, can you shrink the code down to a minimal example? And if you would add "py" to you code blocks:

```py
def foo():
    pass
```

This would result in syntax highlighted code blocks:

def foo():
    pass

What in turn is much better do read.

tom9672 commented 2 years ago

A minimal example : (I will supply the full code if someone want to know how to use a handheld pulse wheel to control the servo.)

class Machinework:
    ...
    def move_handheld(self):
        self.move_handheld_thread_stop_event = threading.Event()
        if self.all_slaves_reached_op_state:
            self.move_handheld_thread = threading.Thread(target=self._handheld)
            self.move_handheld_thread.start()

    def move_handheld_stop(self):
        self.move_handheld_thread_stop_event.set()
        self.move_handheld_thread.join()

    def _handheld(self):
        # while loop to check signal from handheld pulse  wheel 

    def check_all_pos(self):
        # tried both sdo and pdo way to read postions, both cause incorrect wkc sometimes.

    def pdo_loop(self):
        # same as the 'basic_example.py'
    def pdo_check_loop(self):
        # same as the 'basic_example.py'
class Window:
    ...
    def run(self):
       # here i call the move_handheld, it will create a thread, servo waiting for the signal from handheld
        self.machinework.move_handheld() 

        while:
            #loop to get singal from handheld

    def start_handheld(self):
        self.hhThread_stop_event.clear()
        self.hhThread = threading.Thread(target=self.run)
        self.hhThread.start()

    def stop_hand_control_func(self):
        # here I call the move_handheld_stop to stop the thread, servo not waiting signal from handheld
        self.machinework.move_handheld_stop()
        # stop reading signal loop
        self.hhThread_stop_event.set()
        self.hhThread.join()

The incorrect wkc happened while using the 'start_handheld' or 'stop_hand_control_func', or 'check_all_pos' while 'start_handheld'. ( _actual_wkc is 3, and the expected_wkc is 9)

Hi @bnjmnp, I use a RPi 4B , it seems use gigabit network port onboard,Does the onboard Ethernet port lead to the problem is 'incorrect wkc' ? Or the delay of system? I use sudo ./cyclictest -t 5 -p 80 to test the realtime of the system, around min =8, max=120, avg=15. Do you have any other board for suggestion?

I want to make the device works without the 'incorrect wkc' problem.

bnjmnp commented 2 years ago

You could try to increase the timeout that is given to receive_processdata(). Maybe this helps.

tom9672 commented 2 years ago

yes, will try to increase timeout of receive_processdata(). Do you think using subprocess may doing better than threading? As i know multithreading can only use one cpu core, it seems multiprocessing or subprocess can use run multi python process and use multi core. I'm not sure about it, will it do better if i tring to make the program using multi cpu core? In mutli process program, the data exchange seems a difficulty.

bnjmnp commented 2 years ago

I don't think multiprocessing will work or it will help to achieve realtime. I also don't have any other board in mind that solves the Ethernet port problem. Probably one could achieve realtime with MicroPython and a MicroPython board but PySOEM will not install into a MicroPython distribution.

tom9672 commented 2 years ago

@bnjmnp Thank you.

yes, I tried to increase receive_processdata in pdo loop from 10000 to 20000, but still exist the problem of incorrect wkc, but seems not so frequency now. I use a for loop to do the start sleep stop sleep to do the test, it happened the incorrect wkc over 100 times in the test loop. Not very often, but it would be really nice if it could not happen.

'_actual_wkc is 3, and the expected_wkc is 9' what is the mean of 3 and 9.

Will the GUI refresh or log or anyother stuff slow down or crash the system? is that maight a reason for incorrect wkc?

incorrect wkc 3 9
checking
ERROR : slave 0 is in SAFE_OP + ERROR, attempting ack.
ERROR : slave 1 is in SAFE_OP + ERROR, attempting ack.
ERROR : slave 2 is in SAFE_OP + ERROR, attempting ack.
checking
WARNING : slave 0 is in SAFE_OP, try change to OPERATIONAL.
WARNING : slave 1 is in SAFE_OP, try change to OPERATIONAL.
WARNING : slave 2 is in SAFE_OP, try change to OPERATIONAL.
checking
OK : all slaves resumed OPERATIONAL.

SAFE_OP + ERROR, how to check what is the ERROR, Besides the system/network delay, Is there any other possible reasons?

tom9672 commented 2 years ago

The issue is still not solved, not clear about the reasons of this issue. Has anyone else encountered this problem?

tom9672 commented 2 years ago

Maybe not caused by delay of ethernet port or system realtime kernel. I use sudo top to check the cpu, while I using the handheld, the %cpu of python is 100%(total %cpu is around30%, another three core is just watching one core work.....). This maight be the reason that lead to the incorrect wkc, how do you think? @bnjmnp . In my code, to use a handheld pulse wheel to control servo motor, I use one while loop (using a thread) to get the axis, speed scale, analyze A,B pulse and get direction. Then another while loop (in another thread) to give cmd and target position and speed to servo, and using one while loop in a thread to let one axis runs. It can make the servo motor runs very smoothly as expected. But make the cpu using 100%, Not sure, just guess this make the problem of incorrect wkc. Now I'm trying to change the functions, trying to use only one while loop, or add sleeptime. Hope things will work as expected. Thank you bnjmnp, Hope you have a nice weekends.