ottowayi / pycomm3

A Python Ethernet/IP library for communicating with Allen-Bradley PLCs.
MIT License
397 stars 88 forks source link

[Bug] socket timeout error #168

Closed nunchuckBoP closed 2 years ago

nunchuckBoP commented 3 years ago

Hello, I am getting a socket timeout error on the latest version of pycomm3. When I write individual tags, they work, but are very slow. I am connecting to version 33 controllogix PLC. The exception that occurs is this below. I've attached my entire terminal session so you can see what I was doing. I did try to extend LogixDriver._cfg['Timeout'] to 60 (seconds?) but that did not seem to help.

d.write(*tags) Traceback (most recent call last): File "/home/neech/venv/lib/python3.8/site-packages/pycomm3/socket_.py", line 65, in receive data = self.sock.recv(256) socket.timeout: timed out

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "/home/neech/venv/lib/python3.8/site-packages/pycomm3/cip_driver.py", line 605, in _receive reply = self.sock.receive() File "/home/neech/venv/lib/python3.8/site-packages/pycomm3/socket.py", line 72, in receive raise CommError("socket connection broken") from err pycomm3.exceptions.CommError: socket connection broken

The above exception was the direct cause of the following exception:

Traceback (most recent call last): File "", line 1, in File "/home/neech/venv/lib/python3.8/site-packages/pycomm3/cip_driver.py", line 81, in wrapped return func(self, *args, **kwargs) File "/home/neech/venv/lib/python3.8/site-packages/pycomm3/logix_driver.py", line 1123, in write write_results = self._send_requests(requests) File "/home/neech/venv/lib/python3.8/site-packages/pycomm3/logix_driver.py", line 1420, in _send_requests response = self.send(request) File "/home/neech/venv/lib/python3.8/site-packages/pycomm3/logix_driver.py", line 1466, in send return super().send(request) File "/home/neech/venv/lib/python3.8/site-packages/pycomm3/cip_driver.py", line 588, in send reply = None if request.no_response else self._receive() File "/home/neech/venv/lib/python3.8/site-packages/pycomm3/cip_driver.py", line 607, in _receive raise CommError("failed to receive reply") from err pycomm3.exceptions.CommError: failed to receive reply PythonLog.txt

ottowayi commented 3 years ago

Sorry for the late reply, that error looks like the PLC is closing the connection on it's end. There is a timeout value, but it doesn't come from that _cfg variable. It's hard-coded (for now, I plan on adding more advanced config options) as part of the request class. But, it is set to a fairly high number, so as long as you do something every 30s or so the connection will remain open with the PLC. The driver itself does not use a heartbeat to maintain an open connection, so if you go long enough without doing a read/write the PLC may close the connection on it's end. I can't tell how long it was between when you did the individual writes vs all at once, but if you waited awhile between them then the connection may have timed out. If that isn't the case, I would look at the CPU utilization and number of CIP on the comms card you're connecting to. It's possible that the card is running near capacity and doing that many requests at once overloaded it. Like if you're using an ENBT it will be much slower and can't handle near as many connections as an EN2T. Sorry I don't have a definitive answer right now. If you can keep reproducing the problem, configuring the built in logging (shown here) might be able to give me some more information about what is happening.

nunchuckBoP commented 2 years ago

Ok, thanks for getting back with me and thanks for your help. I was able to write a test program and reproduce the error. Also, I've attached the output (main.log), all the code, and a screenshot of the EN2T card diagnostics. I hope this helps. I have not had this trouble in version 32 of logix pycomm3_tests.zip . main.log Screenshot_20211004_132725

ottowayi commented 2 years ago

I looked thru the logs and I think I found an issue. You're connecting thru an ENBT which does not support the larger packets that EN2T and newer support, ~4000 bytes vs 500 bytes. And it looks like some of the UDT/AOI definitions are larger than 500 bytes so it fails to upload them. So when trying to write the data it doesn't have the appropriate structure information to create the request. I'm not sure why that causes the PLC to close the connection. It could be there is a bug that is generating a garbage request or something. It's going to take a bit more investigating. I'm probably going to have to change the tag upload to use fragmented requests so that if the definition doesn't fit into a single packet it will be able to get it in multiple chunks. Either way, you should be getting an error from the driver before you get that socket error. This may take some time for me to investigate and come up with a fix tho, but I will be working on it and let you know what I get figured out.

In the meantime, can you confirm that some of the tags you're trying to write to are part of a UDT/AOI that is larger than 500 bytes? That will help me confirm I'm looking in the right area.

nunchuckBoP commented 2 years ago

I confirmed that the UDT's that these tags are a part of are LESS than 500 bytes. The "Routing" UDT I believe is the only UDT over 500 bytes, and I am not writing to these tags. I can write to these tags if I loop through them individually and write them; without error. I commented that portion out of main.py in an effort to keep the logs a little shorter.

Anyway, I very much appreciate your help.

nunchuckBoP commented 2 years ago

I suppose the whole block of tags would be well over 500 bytes? Not sure if that makes a difference, as I know the library is supposed to chuck those up before it writes them to the controller.

ottowayi commented 2 years ago

ohh okay, then something else must be going on. There might be a bug in how it tracks the request size, because yeah you should be able to write any number of tags at once and the driver should split them into a number of requests. For now, you could try chunking the writes before feeding them to write, so you can save some time instead of individual writes.

ottowayi commented 2 years ago

I haven't found anything too definitive so far except that my packet size estimation is off by 2 bytes, so if your message was estimated exactly 1 byte below the max the actual message would end up 1 byte over. I didn't think the PLC would just close the connection at that point tho and we'd get some sort of error response.

Would you mind trying out something for me though?
In pycomm3/const.py change MULTISERVICE_READ_OVERHEAD to 10 (or 11 if that doesn't work)? That variable is poorly named, it's used for both read and writes. If that fixes the issue we know it's just the message size exceeds the connection size and the fix is pretty simple. I'm working on a better way for building the messages so it'll no longer be an estimated size but a known size anyway.

nunchuckBoP commented 2 years ago

ottowayi, I did what you suggested in a virtual environment and it looks like it worked. Will you be modifying this variable on an upcoming release?

ottowayi commented 2 years ago

Great! Yeah I'll have that fix in the next release. I should have one out soon, there are a couple other minor issues that I need to look into first, but shouldn't be too long.

nunchuckBoP commented 2 years ago

I set the value to 10, by the way.

ottowayi commented 2 years ago

fixed in v1.2.2