S7NetPlus / s7netplus

S7.NET+ -- A .NET library to connect to Siemens Step7 devices
MIT License
1.3k stars 578 forks source link

Guarding against multiple requests at the same time. #295

Closed scamille closed 3 years ago

scamille commented 4 years ago

289 started a small discussion about having multiple requests at the same time with S7.NetPlus causes problems because its design is inherently sequential.

@jakob-ledermann

There is a second bigger issue, which is also in my opinion of topic for now: I managed to send a second request to the plc before the first one was answered by the plc.

My 2 cents:

As far as I understand it, the S7 protocol would allow sending a "identifier" with each request, and you can expect to get that identifier back in the response from the PLC.

S7Net.Plus just always sends the same identifier (0 I believe) and does not check anything. We could check that, further increasing robustness.

But one fundamental limitation of the library certainly is that for the single socket we have per PLC, there can only be 1 request active during any time. Having multiple requests would require a fundemantally different architecture, and make things way way too complicated.

The appropriate solution for S7.NetPlus would be to either add some locks or at least be informative about it being neither multi-thread-safe nor truly suitable for unlocked async operations.

I personally have async locks available for the software where I integrated S7.NetPlus library, so did not encounter any of the problems. But I expect that specifically for users of the async api, this could lead to some confusion.

mycroes commented 4 years ago

I intent to add a threadsafe version that will lock internally. The disadvantage of async locks around the calls to S7NetPlus is that you also lock around message building and result parsing, while internal locking would only need to take care of the stream Read/Write. I do think related improvements should be made as well though, but either way it's something I plan to fix one way or the other.

scamille commented 4 years ago

I intent to add a threadsafe version that will lock internally. The disadvantage of async locks around the calls to S7NetPlus is that you also lock around message building and result parsing, while internal locking would only need to take care of the stream Read/Write. I do think related improvements should be made as well though, but either way it's something I plan to fix one way or the other.

I don't think internal locking will give you that big of a benefit. You certainly needs to put a single lock around each pair of request you send to the plc and the associated reponse. So except for large (>maxPDUSize), multi read/writes, there is probably not much you can do here instead of locking the whole thing.

Building request and parsing can be exluded sure, but that usually is really negligible anyway compared to the overall time required for any request over the network.

Jason-Jelks commented 3 years ago

It sounds like you are opening and communicating via a single channel connection. I certainly experienced this issue because we use the Class.cs function exclusively and are reading/writing blocks of 3000+ bytes at a time which can take a little bit longer to complete. I got around this by opening multiple channels to the PLC (max of simultaneous 31 channels, tested on S7-1200).

If the client opening multiple channels is not an option and this needs to be addressed in the S7.NET project, then I would not suggest locking as you would block the other channels (tested this option already).

Why not setup a ConcurentQueue of the write request. The only problem you would have is that if you receive multiple writes to the same structure/location, do they require that all updates get processed or only the last value written. Typically in manufacturing it is the last value to write, although for some SCADA/Historian applications they need all the values for trending. This is what I wound up building into my client so that I don't 'over-feed', plus this allowed me to reduce the number of channels I had to open.

Jason-Jelks commented 3 years ago

As an addendum to my comment, I have not been able to open the Unit Test associated with the S7.NET project, so all of the testing I have been doing is with a live S7-1200.

VerbovenNick commented 3 years ago

It sounds like you are opening and communicating via a single channel connection. I certainly experienced this issue because we use the Class.cs function exclusively and are reading/writing blocks of 3000+ bytes at a time which can take a little bit longer to complete. I got around this by opening multiple channels to the PLC (max of simultaneous 31 channels, tested on S7-1200).

If the client opening multiple channels is not an option and this needs to be addressed in the S7.NET project, then I would not suggest locking as you would block the other channels (tested this option already).

Why not setup a ConcurentQueue of the write request. The only problem you would have is that if you receive multiple writes to the same structure/location, do they require that all updates get processed or only the last value written. Typically in manufacturing it is the last value to write, although for some SCADA/Historian applications they need all the values for trending. This is what I wound up building into my client so that I don't 'over-feed', plus this allowed me to reduce the number of channels I had to open.

Do you mean with "multiple channels" that you are constructing multiple PLC instances //Plc(CpuType cpu, string ip, Int16 rack, Int16 slot// in your S7 .net application? I am countering the same issue, it could happen that more than one Writeclass or Readclass are initiated at the same time. Thank you I wonder how that you tested it?

Jason-Jelks commented 3 years ago

Do you mean with "multiple channels" that you are constructing multiple PLC instances //Plc(CpuType cpu, string ip, Int16 rack, Int16 slot// in your S7 .net application? I am countering the same issue, it could happen that more than one Writeclass or Readclass are initiated at the same time. Thank you I wonder how that you tested it?

Regarding the 'Class' R/W specifically; This all depends on which Siemens PLC you are connecting to of course. The S7-1200 can accept up to 32 simultaneous client connections that are all reading and writing simultaneously on different threads (I tested and verified this), although the performance is terrible if you max out the PLC's connections. The S7-1500 can accept a lot more. I have found that I can have a high speed reader thread polling constantly with no 'Sleep()' for a specific DB, and multiple reader/writer threads that are retrieving/feeding data at various intervals or On-Demand.

Simulation testing only goes so far. We test and verify directly on our inhouse PLC's, as that gives the most accurate feedback for endurance testing. In the hypothetical circumstance there were 2 connections writing to the same DB/Struct at the same time during the same scan, then the last write would be the one to persist. I have seen that some applications implementing S7.NET open and close the connection with each R/W request. We must persist our connections because it takes too much time to open and close the connection when we are trying to get NRT motion encoder information for tracking and guidance (which is what the reader thread mentioned above is for).

We are implementing this using a gRPC bi-directional process that takes a protobuf into a Class and injects it into the ConcurrentQueue for the process thread to assign to a reader/writer process.

One of the critical aspects for dealing with the Class.cs is that it is missing some dependency injection to control what types are valid to which PLC. scamile did a great job with the DTL (DateTimeLong) format. However, it is only valid for S7-1200 & S7-1500 and not supported on older models, so the onus is on the implementing developer to understand which type they need and build the class using the S7.NET types instead of standard .NET DataTypes, but then you would have to use a parser to convert System.DateTime to S7.NET.Types.DateTime[Long] which defeats the purpose of using standard .NET Classes. For Additional information, the DT format is redacted in the S7-1200, and was only recently brought back into the S7-1500. I have no confirmation from Siemens Engineering if the S7-1200 will be retro-fitted with the DT format, but indications are 'NO'. If you have access to Siemens Support then you can see their documentation on which PLC models support which data types, although I will admit I don't like sifting through all their documentation. Unfortunately, I don't have a lot of time to devote to this project... yet! However once I have completed our changes and can verify with both Unit Test and actual endurance test, I will submit my changes mycroes review (but the earliest that will be is 1Q21).

VerbovenNick commented 3 years ago

HiThank you for the detailed answer!I have application with s7-300 plc.Do you mean with 2 connections, 2 plc instances in your .net application?ThxVerzonden vanaf mijn Galaxy -------- Oorspronkelijk bericht --------Van: Jason Jelks notifications@github.com Datum: 13/11/20 03:25 (GMT+01:00) Aan: S7NetPlus/s7netplus s7netplus@noreply.github.com Cc: VerbovenNick nverboven@r2d-engineering.be, Comment comment@noreply.github.com Onderwerp: Re: [S7NetPlus/s7netplus] Guarding against multiple requests at the same time. (#295) Regarding the 'Class' R/W specifically; This all depends on which Siemens PLC you are connecting to of course. The S7-1200 can accept up to 32 simultaneous client connections that are all reading and writing simultaneously on different threads (I tested and verified this), although the performance is terrible if you max out the PLC's connections. The S7-1500 can accept a lot more. I have found that I can have a high speed reader thread polling constantly with no 'Sleep()' for a specific DB, and multiple reader/writer threads that are retrieving/feeding data at various intervals or On-Demand. Simulation testing only goes so far. We test and verify directly on our inhouse PLC's, as that gives the most accurate feedback for endurance testing. In the hypothetical circumstance there were 2 connections writing to the same DB/Struct at the same time during the same scan, then the last write would be the one to persist. I have seen that some applications implementing S7.NET open and close the connection with each R/W request. We must persist our connections because it takes too much time to open and close the connection when we are trying to get NRT motion encoder information for tracking and guidance (which is what the reader thread mentioned above is for). We are implementing this using a gRPC bi-directional process that takes a protobuf into a Class and injects it into the ConcurrentQueue for the process thread to assign to a reader/writer process. One of the critical aspects for dealing with the Class.cs is that it is missing some dependency injection to control what types are valid to which PLC. scamile did a great job with the DTL (DateTimeLong) format. However, it is only valid for S7-1200 & S7-1500 and not supported on older models, so the onus is on the implementing developer to understand which type they need and build the class using the S7.NET types instead of standard .NET DataTypes, but then you would have to use a parser to convert System.DateTime to S7.NET.Types.DateTime[Long] which defeats the purpose of using standard .NET Classes. For Additional information, the DT format is redacted in the S7-1200, and was only recently brought back into the S7-1500. I have no confirmation from Siemens Engineering if the S7-1200 will be retro-fitted with the DT format, but indications are 'NO'. If you have access to Siemens Support then you can see their documentation on which PLC models support which data types, although I will admit I don't like sifting through all their documentation. Unfortunately, I don't have a lot of time to devote to this project... yet! However once I have completed our changes and can verify with both Unit Test and actual endurance test, I will submit my changes mycroes review (but the earliest that will be is 1Q21).

—You are receiving this because you commented.Reply to this email directly, view it on GitHub, or unsubscribe.

Jason-Jelks commented 3 years ago

I have application with s7-300 plc.Do you mean with 2 connections, 2 plc instances in your .net application?

Yes that is correct. You will have to govern when the instance you are going to perform your read/write is actively performing another request, as multiple read/write request into the same instance at the same time could be disastrous.

I will take a moment to also indicate that the S7.NetPlus does not, and cannot, perform validation to ensure that the TYPE you are writing to matches. Only that you have not exceeded the limit of the structure (DB) into which you are writing. This was discovered quite by accident when there was an unexpected change in the DB/struct that was not 'mirrored' in the .NET Class which was updating to the DB. The end result was that none of the values in the DB tags for the range of bytes that were updated made sense. So when you are using the Class read/write, make sure YOU are in control of the Struct(UDT)/DB which is being written to!

mycroes commented 3 years ago

I think this is now solved with the release of v0.12.0, thus closing the issue. Don't hesitate to respond if there are still questions.

jakob-ledermann commented 3 years ago

I agree. AFAIK in commit 5318f94dd71b960c0771c4bb57615e3b42e7ee09 (tagged as v0.12.0) it should no longer be possible to trigger to original issue.

The TaskQueue is a very interesting implementation. A minor change I would have included would be the removal of StreamExtensions.ReadExact and all callers (as far as I see TPDU.Read in COTP.cs is unused and not intended as public API for library users.)

The synchronous API still remains with a potential for a deadlock in the communication (the asynchronous API also if no cancellationTokens are used). But deadlocks are out of scope for this issue.

mycroes commented 3 years ago

Hi @jakob-ledermann,

I agree. AFAIK in commit 5318f94 (tagged as v0.12.0) it should no longer be possible to trigger to original issue.

The TaskQueue is a very interesting implementation.

Thank you, it's actually the first time I'm writing it like this, because I used to do something similar using Task.ContinueWith(...), however that requires locking while this can use Interlocked.Exchange(...). I guess this will work fine though. I actually wanted to use Nito.AsyncEx, but it wasn't available for all S7NetPlus target platforms. Either way, from what I saw at a quick glance it's also using TaskCompletionSources for it's AsyncLock, so I guess this should do just fine.

A minor change I would have included would be the removal of StreamExtensions.ReadExact and all callers (as far as I see TPDU.Read in COTP.cs is unused and not intended as public API for library users.)

I thought it was still in use, but yes, that's definitely one to clean up as well.

The synchronous API still remains with a potential for a deadlock in the communication (the asynchronous API also if no cancellationTokens are used). But deadlocks are out of scope for this issue.

Not sure how you think it can deadlock? A deadlock would happen when something queued on the TaskQueue would await taskQueue.Enqueue(...) because the latter can never start. The communication can still stall if no data can be read, however AFAIK the TcpClient.ReceiveTimeout (set by PLC.ReadTimeout) should take care of that. It's not set by default though...

Regards, Michael

jakob-ledermann commented 3 years ago

Thank you, it's actually the first time I'm writing it like this, because I used to do something similar using Task.ContinueWith(...), however that requires locking while this can use Interlocked.Exchange(...). I guess this will work fine though. I actually wanted to use Nito.AsyncEx, but it wasn't available for all S7NetPlus target platforms. Either way, from what I saw at a quick glance it's also using TaskCompletionSources for it's AsyncLock, so I guess this should do just fine.

A minor change I would have included would be the removal of StreamExtensions.ReadExact and all callers (as far as I see TPDU.Read in COTP.cs is unused and not intended as public API for library users.)

I thought it was still in use, but yes, that's definitely one to clean up as well.

Sorry, yes TPDU.Read(Stream) is still used by TSDU.Read(Stream), which in turn is unused.

The synchronous API still remains with a potential for a deadlock in the communication (the asynchronous API also if no cancellationTokens are used). But deadlocks are out of scope for this issue.

Not sure how you think it can deadlock? A deadlock would happen when something queued on the TaskQueue would await taskQueue.Enqueue(...) because the latter can never start. The communication can still stall if no data can be read, however AFAIK the TcpClient.ReceiveTimeout (set by PLC.ReadTimeout) should take care of that. It's not set by default though...

Maybe deadlock is not the best word (as it is normally used with blocking threads). I mean it is possible all communication to the PLC is unable to make any progress (and therefore send its request), when a Request remains unanswered. I am not sure if the ReadTimeout on the NetworkStream is respected by the async operations, as I would expect them to map to the IO Completion Port API of the windows socket (which may be wrong and is not possible on .Net 5 or .Net Core Runtimes on Linux). But again no timeout by default means a naive user might wait indefinitely. As the recommended way is to use the Async API this could well be covered with documentation and maybe even by marking the synchronous API as deprecated and obsolete. Such a missing answer could be triggered by a well timed disconnect of a networkcable behind at least one switch from the PC as an idle TCP connection (nothing sending) is considered alive and good for quite a long period (default probing interval of TCP Keepalive is 2 hours). As any subsequent request is only allowed on the wire after the response to the previous request has been seen, this can only be truly solved using timeouts. But as already stated, I consider the above as offtopic for this issue.

mycroes commented 3 years ago

Cleanup in #398.