stlehmann / pyads

Python wrapper for TwinCAT ADS
MIT License
253 stars 94 forks

How to reconnect properly? #102

Closed emalinen closed 3 years ago

emalinen commented 4 years ago

I have written a hardware simulator script for Windows that should survive ADS errors and TwinCAT switching between config mode and run mode. Reconnecting does not work, neither by calling .close() and .open() nor by deleting the whole Connection instance and recreating it. The only way to get data moving again is to kill the whole script, but I don't want an external loop that restarts the process when the script exits. Is there a known bulletproof way to reconnect ADS?

stlehmann commented 4 years ago

Are you closing the connection on the side of the pyads client before you disconnect your TwinCAT host?

emalinen commented 4 years ago

No. I don't want to touch the script once it is started, so it tries to close and restart the pyads connection after the connection to TwinCAT is lost (ADS timeout etc.).

emalinen commented 4 years ago

I investigated this a bit more, and it seems the actual problem is that I stop getting callbacks after a while. Running the test code below against a local ADS server on my PC works without problems, but with an actual PLC pyads stops calling the callback function after a few seconds, and I haven't found any way to restart it. Do you know what could cause this behavior and how I can fix it?

import pyads
import time

AMS_ADDRESS         = 'x.xx.x.xxx.x.x'
AMS_PORT            = 851
VARIABLE_PATH       = 'some.variable'

plc = pyads.Connection(AMS_ADDRESS, AMS_PORT)
plc.open()

counter = 1
def callback(notification, data_name):
    global counter
    print('Callback called %d times' % counter)
    counter += 1

attr = pyads.NotificationAttrib(2,
                                trans_mode=pyads.ADSTRANS_SERVERCYCLE,
                                cycle_time=100)

plc.add_device_notification(VARIABLE_PATH, attr, callback)

# Keep the script alive so that notifications can arrive.
while True:
    time.sleep(1)
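
One way to at least detect that the callbacks have gone quiet is to timestamp every callback and compare that against a timeout from the main loop. The sketch below is only an illustration under that assumption: `NotificationWatchdog` and `stall_timeout` are hypothetical names, not part of pyads, and the clock is injectable so the logic can be tried without a PLC.

```python
import time


class NotificationWatchdog:
    """Track when a notification callback last fired.

    Hypothetical helper, not part of pyads.
    """

    def __init__(self, stall_timeout, clock=time.monotonic):
        self.stall_timeout = stall_timeout  # seconds of silence treated as a stall
        self._clock = clock
        self._last_seen = clock()
        self.count = 0

    def callback(self, notification, data_name):
        # Same two-argument shape as the callback in the test script above.
        self.count += 1
        self._last_seen = self._clock()

    def stalled(self):
        # True once no callback has arrived for longer than stall_timeout.
        return self._clock() - self._last_seen > self.stall_timeout
```

In the test script, `watchdog.callback` would be passed to `add_device_notification()` instead of `callback`, and the `while` loop would check `watchdog.stalled()` once per second. This only reports the stall; it does not explain or fix it.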

Podmornica commented 4 years ago

I have the same issue. From my Linux PC I am reading and writing data on the PLC with the read_write command.

The problem is that if the connection is lost, I first get "ADSError: timeout elapsed (1861)", and when the PLC is back on the network I cannot reconnect to it. I get the error "ADSError: Unknown Error (-1)".

I tried to close and reopen the connection, but this does not solve the issue.

Is there any other command that should be issued besides close() so that the connection is truly closed or reset? Do I need to clear anything else?

Has anybody managed to reconnect successfully?

P.S.: If I simply restart the program, everything works, but I think there should be a way to reconnect.

P.P.S.: This issue is only present when pyads is used on Linux; on Windows I can reconnect without any problems.

wyda commented 4 years ago

I managed to reconnect after a PLC restart/config mode, though without routing. If a route to the PLC is involved, some more steps are needed.

I think the point is to wait until ADS has started again (port open etc.). What I do is call the open() method and check whether the connection is up again by calling read_state(). If read_state() fails, I try again with open() followed by read_state() after a few seconds, and so on.

Once connected again, remember to reassign the notifications, since they are no longer valid after a PLC restart.

If possible, try to delete all handles with del_device_notification() before the PLC shuts down (state).

Below is the code that worked in my case (without reassigning the notifications):

import time

from tqdm import tqdm


# Method of a class that keeps a pyads.Connection in self.plc.
def __try_to_connect(self, trials, sleep_time_sec, callback, event):
    reconnected = False

    with tqdm(total=trials) as pbar:
        for trial in range(trials):
            if event.is_set():
                print("close down event is set")
                return
            time.sleep(sleep_time_sec)

            try:
                self.plc.open()
                state = self.plc.read_state()
                if state[0] != 0:
                    reconnected = True
                    break
            except Exception:
                # PLC not reachable yet; try again on the next iteration.
                pass

            pbar.update(1)

    # Report the outcome to the caller.
    callback(reconnected)
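
The code above leaves out reassigning the notifications. A minimal sketch of that step, assuming the subscriptions are kept in a list so they can be replayed after a successful reconnect. `NotificationRegistry` is a hypothetical helper, not part of pyads; it only relies on the connection's `add_device_notification()` and `del_device_notification()` methods.

```python
class NotificationRegistry:
    """Remember subscriptions so they can be re-added after a reconnect.

    Hypothetical helper, not part of pyads.
    """

    def __init__(self, plc):
        self.plc = plc
        self._subscriptions = []  # (variable name, attributes, callback)
        self._handles = []        # handle pairs returned by the connection

    def add(self, name, attr, callback):
        self._subscriptions.append((name, attr, callback))
        self._handles.append(self.plc.add_device_notification(name, attr, callback))

    def release(self):
        """Delete the handles while the PLC is still reachable, if possible."""
        for handles in self._handles:
            try:
                self.plc.del_device_notification(*handles)
            except Exception:
                pass  # connection already gone; the handles are stale anyway
        self._handles = []

    def restore(self):
        """Call after open() and read_state() have succeeded again."""
        self._handles = [
            self.plc.add_device_notification(name, attr, callback)
            for name, attr, callback in self._subscriptions
        ]
```

With this, `restore()` would be called from the callback that receives `reconnected == True`, and `release()` whenever a shutdown of the PLC can be detected in advance.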

stlehmann commented 4 years ago

Maybe it would make sense to handle reconnection within plc.open() method by adding a retries parameter.

wyda commented 4 years ago

I think that would be a nice option. In any case, it would be nice to get feedback on whether plc.open() successfully opened the connection or not. That would make it clear that the call may fail to open the connection, and would make it easy to react to the outcome.

I thought that, instead of read_state(), an easier way of checking the connection could be to check the returned port number, but that does not seem to be a valid indicator: as long as a message router is present, it will return a valid port number.
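
As a sketch of the idea discussed here (not an existing pyads feature), a helper could wrap open() and read_state() in a retry loop and report the outcome as a boolean. `open_with_retries`, `retries`, `delay`, and `sleep` are hypothetical names; the connection is passed in, so any object with open(), read_state(), and close() works.

```python
import time


def open_with_retries(plc, retries=5, delay=2.0, sleep=time.sleep):
    """Try to open `plc` and confirm the link with read_state().

    Hypothetical helper, not part of pyads. Returns True as soon as
    read_state() yields a state, False if every attempt fails.
    """
    for attempt in range(retries):
        try:
            plc.open()
            # A valid port number alone proves nothing, so ask the PLC itself.
            if plc.read_state() is not None:
                return True
        except Exception:
            pass  # PLC or router not ready yet
        try:
            plc.close()  # start the next attempt from a closed connection
        except Exception:
            pass
        if attempt < retries - 1:
            sleep(delay)
    return False
```

Catching `Exception` keeps the sketch independent of pyads; in real code the narrower `pyads.ADSError` may be the better choice.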

frat3rius commented 4 years ago

I've got a small script for home automation where I only use a BK9050 with read/write methods (all launched on a Raspberry Pi, which is also crucial, since from what I've read the problem only occurs on Linux). My script creates two threads for each connected BK (one for reading and one for writing), and every thread has its own Connection instance (which is also required for proper threaded read/write; otherwise I got small errors). After a very long struggle to make it reconnect after connection problems, I finally managed to do it. The reconnection time isn't quite repeatable, but sooner or later it does reconnect properly.

There are two important factors to make it work:

  1. You always need to close the connection when you detect an issue with it, which I do with the code below (run() contains the logic that tries to read the registry). The connect() method on the controller returns an already opened Connection instance, or None on failure:

      try:
          reader.run()
      except Exception:
          if self.controller.bk:
              self.controller.bk.close()
          self.controller.bk = self.controller.connect()

  2. The connect() method needs to clear the route as described in a previous issue: https://github.com/stlehmann/pyads/issues/47 I know that Connection should do that for us, but somehow it doesn't work without it. My connect() implementation is below:

      def connect(self):
          bk = None  # stays None if the Connection constructor itself raises
          try:
              bk = pyads.Connection(self.ads_id, self.ads_port, self.ip)
              bk.set_timeout(500)
              # Recreate the route explicitly; see issue #47.
              pyads.delete_route(bk._adr)
              pyads.add_route(bk._adr, self.ip)
              bk.open()
              return bk
          except Exception:
              if bk is not None:
                  bk.close()
              time.sleep(self.connect_wait)
              return None

This is really the only scenario in which the script works correctly and reconnects after every connection issue.
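
The two factors above can be combined into one loop per thread. The following is a minimal sketch, not the original script: `run_with_reconnect`, `work`, `stop`, and `wait` are hypothetical names, and `connect` is assumed to behave like the connect() method above, returning an opened connection or None.

```python
import time


def run_with_reconnect(connect, work, stop, wait=1.0, sleep=time.sleep):
    """Call work(conn) until stop() is true, reconnecting after any error.

    Hypothetical helper: `connect` returns an opened connection or None,
    `work` performs one read or write on that connection.
    """
    conn = None
    while not stop():
        if conn is None:
            conn = connect()
            if conn is None:
                sleep(wait)  # connection attempt failed; back off before retrying
            continue
        try:
            work(conn)
        except Exception:
            # Always close the broken connection before creating a new one.
            try:
                conn.close()
            except Exception:
                pass
            conn = None
    if conn is not None:
        conn.close()
```

Each reader or writer thread would run this with its own `connect`, so that no Connection instance is shared between threads. If connect() already sleeps on failure, as the implementation above does, `wait` can be set to 0.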

I'm not sure where the problems come from, because I don't know enough about the ADS C library that pyads uses; maybe they come from the processes/connections in the ADS library having some specific lifetime. I'm also not sure how the AMS router works in pyads on Linux. From what I've seen it's handled differently than on Windows (which uses the router installed with TwinCAT), so for linux it has some internal one, which is implemented in ADS C library? Is this router shared by the entire library, and is that why it needs clearing with delete_route and add_route?

If I can help in any way or provide more feedback, please let me know; I'll be more than happy to do so.

stlehmann commented 4 years ago

so for linux it has some internal one, which is implemented in ADS C library?

Yes, on Linux the routing is handled by adslib which is developed and maintained separately from TwinCAT by Beckhoff: https://github.com/Beckhoff/ADS