danomagnum / gologix

Ethernet/IP client library for Go inspired by pylogix that aims to be easy to use. Supports being a client and a class 3 / class 1 server.
MIT License
32 stars 8 forks source link

Issue on Header when using ReadMulti() #11

Closed nicola1695 closed 5 months ago

nicola1695 commented 8 months ago

HI, I'm new to the Ethernet/IP protocol and thanks to this library I'm trying to develop a driver that reads from a PLC. I don't have the PLC yet, so I used the Class 3 Server example and set up some very simple tags of type Int32 and Int16.

func main() {

    r := gologix.PathRouter{}

    p1 := gologix.MapTagProvider{}
    path1, err := gologix.ParsePath("1,0")
    if err != nil {
        log.Printf("problem parsing path. %v", err)
        os.Exit(1)
    }
    r.Handle(path1.Bytes(), &p1)

    p1.TagWrite("TestDint", int32(12))
    p1.TagWrite("TestInt", int16(3))

    s := gologix.NewServer(&r)

    go s.Serve()

    t := time.NewTicker(time.Second * 5)
    for {
        <-t.C
        p1.Mutex.Lock()
        log.Printf("Data 1: %v", p1.Data)
        p1.Mutex.Unlock()

    }

}

With another script (Client) I tried to read/write a single tag successfully following the SimpleRead and Write in the examples folder: (I omit the client creation, connection and disconnection part)

var tag1 int32
err = client.Read("TestDint", &tag1)
    if err != nil {
        log.Printf("error reading TestDint. %v", err)
    }
    // do whatever you want with the value
    log.Printf("tag1 has value %d", tag1)

so I can read well. If I want to write, I can do it successfully too because I can see its change on the server side:

var tag1 int32
err = client.Write("TestDint", tag1)
    if err != nil {
        log.Printf("Error writing TestDint. Error: %v", err)

    }

However, if I want to use the ReadMulti() function (following the MultiRead in the examples folder) I get an error both on the client and server side:

var mr multiread

err = client.ReadMulti(&mr)
    if err != nil {
        log.Printf"error reading multiread. %v", err)
            } else {

                log.Printf"multiread struct has values %+v", mr)
            }
       }

Error on the Client side:

problem in read list: problem reading header from socket: EOF: < nil >

Looking in the code, I saw that this is generated by the recv_data() function in the sendreceive.go file.

On the Server side, however, I read two LOGs:

  • Trying to read from MapTagProvider
  • Error on connection 192.168.1.153:44188. problem with sendunitdata problem handling multi service. problem getting data from provider. tag not in map.

I understood from the LOGs that in fact the name of the tag I want to read is empty and there is a blank space. Do you have any idea why this happens?

Thanks for your great work.

danomagnum commented 8 months ago

How is your multiread struct defined, does it match the example?

I'll take a look this evening and see what I can see.

nicola1695 commented 8 months ago

You're right, I forgot to mention it. I defined the multiread struct as in your example:

type multiread struct {
        TestInt  int16 `gologix:"TestInt"`
        TestDint int32 `gologix:"TestDint"`
    }

Thank you.

danomagnum commented 8 months ago

So it looks like the multiple service message wasn't working because it was actually calling the single read behind the scenes for some reason. I went ahead and got a rudimentary implementation going and it looks like it is working for multi-reads and multi-writes. Give Beta 18 a try and let me know how it goes.

The server code doesn't get as much love as the client and I know that in its current state it won't work for reading/writing strings for example.

The built-in MapTagProvider type also only provides rudimentary tag access - It doesn't support addressing arrays directly

It interprets "testtag[3]" as a tag with a literal "[3]" as a string at the end of the map key. Same thing for nested UDT tags - it interprets the dots as literals in the map keys. So if you need those for testing you'll have to kind of fake out the tag mapping on the server end by creating map entries with the full tag access paths as the key.

nicola1695 commented 8 months ago

It worked! The only problem I found was that the server doesn't recognize the capitalization of the tags, so I had to rename the server-side tags to "testint" and "testdint". On the client side, however, I left the capital letters "TestInt" and "TestDint" and I see that on the server it reads "testint" and "testdint". I don't know if it's the client that makes a mistake in formatting the letters in the packet it sends or the server that reads them incorrectly. I also saw that in the multiread struct the first letter of the keys must necessarily be a capital letter, is this correct? Thanks again!

Edit: I just realized that even for the single read capitalization is not recognized. For example, if I have a tag called "Testtag1" on both the server and the client, in the server logs I read:

  • Trying to read testtag1 from MapTagProvider
    • Error on connection 192.168.1.153:51566. problem with sendunitdata problem handling non-frag read. problem getting data from provider. tag testtag1 not in map
danomagnum commented 8 months ago

So the actual PLCs aren't case-sensitive. You can read TeStTaG or testtag or TESTTAG and it will give you the same value assuming that tag exists. Because of this the client sets the tag name to lowercase before requesting it (https://github.com/danomagnum/gologix/blob/e76e9d5b2872c56d7374f8c2b55136f34e4b635a/ioi.go#L117C71-L117C71) . That was a design decision but probably not necessary.

The TagWrite and TagRead functions of the MapTagProvider could lowercase the values when inserting/retrieving them from the map so that it will work more like a controller does. I'll try to get that change in this evening.

For the struct requiring capitals for the first letter that is correct - the fields have to be exported and that means they have to be capital. The same requirement exists for the json module for the same reason (see https://stackoverflow.com/questions/11126793/json-and-dealing-with-unexported-fields).

nicola1695 commented 8 months ago

I'm new to the Ethernet/IP protocol and the Go language, so thank you for your support and your very useful library! I hope this discussion can help other people too.

nicola1695 commented 7 months ago

Hi, I would like to know if the limitations you described also apply to ReadMulti(). For example, is this function capable of reading Bools present in UDTs? Can I read entire arrays or individual elements from them? Thank you

danomagnum commented 7 months ago

I would expect reading a bool inside a UDT to work properly. Same for individual array elements. Any atomic type you specify in the tag on the struct field should read just fine.

I'd have to test reading a full array inside a ReadMulti. I think it would work, but the read function reflects on the slice size to know it needs to request multiple elements so you'd have to set up your struct properly. I'd have to test it to be sure.

nicola1695 commented 7 months ago

So if I want to read an int32 array, can I declare a struct like this?

type multiread struc{

                // other tags
        ArrayTag [ ]int32 `gologix:"arraytag"`

}
danomagnum commented 7 months ago

I checked and right now to read multiple elements of an array, you have to do a dedicated single read (see the ReadArray example).

There isn't a reason you shouldn't be able to do it through multiread, but the way it is set up right now it always assumes an element count of 1.

I'll add that as a bug.

Now, if you've only got a small array you could add the elements individually as a stopgap.

type x struct{ E0 int32 gologix:"ArrayTag[0]" E1 int32 gologix:"ArrayTag[1]" E2 int32 gologix:"ArrayTag[2]" OtherTag int16 gologix:"SomeOtherTag" }

nicola1695 commented 7 months ago

I have a problem with using the struct to read. I have many tags of different types to read and I can't define a specific struct by hand. I can make a map[string]string with a "TagName: Type" pair. Do you have any idea how I can use ReadMulti() in this condition?

danomagnum commented 7 months ago

So if you look at the code behind ReadMulti - it builds a slice of tag names and a slice of types and then sends that to ReadList(tags, types) to do the actual read. If you can make two slices instead of a map you should be able to call that instead. The only tricky thing there is that it returns a slice of []any so you'll have to cast the types back to their proper go types.

See the implemenation of ReadMulti here: https://github.com/danomagnum/gologix/blob/c2c289f9ae3de8e639e6229bdb9c018af9fcce60/read.go#L495

nicola1695 commented 7 months ago

I did it with this custom function in my project:

func ReadMultiFromMap(client *gologix.Client, tagMap map[string]string) (map[string]interface{}, error) {

    // Build the tag list from the map
    tags := make([]string, 0)
    types := make([]gologix.CIPType, 0)

    for tag, typ := range tagMap {

        tags = append(tags, tag)

        ct := gologix.GoVarToCIPType(rType(typ))
        types = append(types, ct)

    }
        // Print for Debug
    fmt.Printf("Tags = %v\n", tags)
    fmt.Printf("Types = %v\n", types)
    // Generate IOIs for each tag
    resultValues, err := client.ReadList(tags, types)
    if err != nil {
        return nil, fmt.Errorf("problem in read list: %w", err)
    }
        // Print result values for Debug
    fmt.Printf("Types = %v\n", resultValues)
    // Unpack the result values into the map
    resultMap := make(map[string]interface{})

    for i, tag := range tags {

        val := resultValues[i]
        resultMap[tag] = val
    }

    return resultMap, nil
}

where the rType function return an initialized variable of the type descripted in the string. In this way the function can return a map with mixed type. It seems to work fine.Tell me what you think about it. Thank you.

danomagnum commented 7 months ago

Yep, that looks right to me.

I may add a native ReadMap function to the library as this seems pretty convenient - If so it will probably take a map[string]any or map[string]CipType. The former will update the values of the map, while the later would return a new map like you've done. I'll have to think about which is a better solution.

danomagnum commented 7 months ago

Beta 19 has some API changes (that's why we're still in Beta after all), but it adds array reads to multi reads by map with a new ReadMap function and multi reads by struct with the original MultiRead function.

nicola1695 commented 7 months ago

I saw your update to Beta 19. I probably noticed an oversight in the MapRead example, because I think that the ReadMap() function should actually be called in line 43, is that correct?

https://github.com/danomagnum/gologix/blob/521ce12ea5fcc97fda6e9e9082e8037fca5b93a8/examples/MapRead/main.go#L35-L46

Ultimately, is it now possible to read arrays of multiple elements?

Thank you

danomagnum commented 7 months ago

Yes, you should be able to read multi-element arrays while reading other tags.

The ReadMulti function checks if you passed it a map[string]any and if so, it calls MapRead instead of doing the struct parsing - so it will do the right thing depending on what you pass it.

nicola1695 commented 7 months ago

So, if we have a struct we pass "&m" as the argument of ReadMulti, while if it is a map we pass "m" ?

// m is a struct
client.ReadMulti(&m) 

// m is a map[string]any
client.ReadMulti(m) 
danomagnum commented 7 months ago

Yes. Maps and slices are kind of pointers internally so you can often pass them without a pointer and they can be updated by the called function.

nicola1695 commented 7 months ago

I have some questions about this library: does this one use implicit or explicit messages? If I want to read tags with ReadMulti() at a speed of 100 ms can I keep the default RPI unchanged?

danomagnum commented 7 months ago

The client uses class 3 / explicit messages. It will try to read them as quickly as you call the function. Note that there is a lock on the socket so only one message ever goes at a time and another is not sent until the controller replies. The maximum rate you read the tags is determined by PLC response time. Depending on the controller model and network latency, 30-50 ms per message is typical with a message holding maybe 200 tags. This benchmarking was done on an old low-end controller so I would expect a new one to perform better.

If you want implicit messages you can set up a class 1 server. (see Server_Class1 for an example). Then you'd add the computer running the server to the IO tree on the controller as a generic "ethernet bridge" and add generic ethernet modules to that bridge that correspond to the paths you set up when defining the controller (see the images in that example folder). This lets you have multiple "io connections" between the controller and the computer. You can do up to 500 bytes per "slot". In this case you can set your RPI for whatever your network will suffer - these are UDP packets instead of TCP so at some point you'll start dropping packets and after some number in a row you'll get connection failures. Generally speaking 10ms and above should probably work without issue.

nicola1695 commented 7 months ago

So if I understand correctly the default RPI value of 2.5s is ignored, am I right? About the message containing 200 tags, is this approximately the maximum number of tags I can read with the ReadMulti() function? If I have more than 200 tags (e.g. 1000 tags) I thought of executing ReadMulti() in a for loop for blocks of 200 tags, but I don't know if the function is already capable of splitting the request into multiple messages. Thank you.

danomagnum commented 7 months ago

There isn't an RPI value at all with the explicit messages. There is a timeout setting - is this what you mean?

(Well there kind of is an RPI, but it's not obvious what use it has. If you set up a "connected" class 3 msg in the controller to a server and assign an RPI it will re-send the data over and over again at that same rate. It will send the exact same data though, even if the values change in the controller. I guess to keep the connection alive? Either way, the client in this library doesn't do anything like that.)

The ReadMulti function should chunk the data up into messages based on the negotiated forward open size with the controller, tagname lengths, and resultant data sizes. Now that you mention it I should double check that it still does that correctly now that it can read array types. (If it assumes a dint type only takes 4 bytes in the response but you want 100 of them, that means the calculated response size will be off by 396 bytes. Often the tag names end up being the limiting factor since they are quite a bit larger than the responses so this may still be OK in most scenarios though)

nicola1695 commented 7 months ago

I'm referring to this RPI value of 2.5 seconds: https://github.com/danomagnum/gologix/blob/521ce12ea5fcc97fda6e9e9082e8037fca5b93a8/client.go#L85-L102

About the ReadMulti() function, so is it useless I split the request for many tags into multiple blocks? (E.g. I want to read 1000 tags and I put the ReadMulti() function in a for loop reading 200 tags for each iteration)

danomagnum commented 7 months ago

The RPI is sent to the PLC in the opening handshake. I believe it is used on the PLC side as a timeout. See this: https://github.com/danomagnum/gologix/blob/master/docs/connectionTimeouts.csv

The ReadMulti function should split a single read of 1000 tags into multiple messages and re-assemble the responses. Here is the logic where it loads as many tags as will fit into each message and loops until it has read them all: https://github.com/danomagnum/gologix/blob/521ce12ea5fcc97fda6e9e9082e8037fca5b93a8/partialread.go#L33C2-L33C2

I just made a change that should make it take array size into account ( https://github.com/danomagnum/gologix/commit/10433bdf425f469a53c61fbbe92ebb2ab209f9eb ). I will run the unit tests on everything this evening and if it works I'll publish a new release with this change.

nicola1695 commented 7 months ago

I've seen that if the tag list is a map, the ReadMap() function is called. https://github.com/danomagnum/gologix/blob/521ce12ea5fcc97fda6e9e9082e8037fca5b93a8/read.go#L498 However, this function does not have the logic where it splits the message because it does not call ReadList(). Anyway, I think the solution might be to use the function I had written here. What do you think about it? https://github.com/danomagnum/gologix/issues/11#issuecomment-1806327935

Anyway, I'm writing to you about a different problem I'm having. As I told you, I am simulating the PLC with the class 3 server in which I have inserted various very simple tags. I use the client to read tags in an infinite loop through the ReadMulti() function. The operations I do each loop are (1) create the client, (2) connect, (3) read with ReadMulti(), and (4) disconnect. I need this because I need to always know the latest tag data. But the problem I encounter is that after a few readings the server no longer responds and so the connecting function waits about 2 minutes and 5 seconds (yes minutes) before giving me an error on the client side which is typically:

Error opening client: dial tcp 10.17.70.143:44818: connect: connection timed out

On the server side I noticed that after every disconnection there is always an error after the forward close, I don't know if this could be a reason.

Note: When this happens, I can run another script and read/write to the same server correctly.

Here I copy some server logs of my tests when I try to Read 10 tags:

2023/11/20 17:11:22 new connection from 10.7.194.64:51012 2023/11/20 17:11:22 context: 0 2023/11/20 17:11:22 register message: {ProtocolVersion:1 OptionFlag:0} 2023/11/20 17:11:22 context: 0 2023/11/20 17:11:22 hee: 0. timeout: 0 2023/11/20 17:11:22 items: [{Header:{ID:0 Length:0} Data:[] Pos:0} {Header:{ID:178 Length:52} Data:[91 2 32 6 36 1 10 14 143 100 202 241 144 100 202 241 142 100 118 23 141 100 202 241 3 0 0 0 160 37 38 0 160 15 0 66 160 37 38 0 160 15 0 66 163 3 1 0 32 2 36 1] Pos :0}] 2023/11/20 17:11:22 got large forward open from 10.7.194.64:51012 2023/11/20 17:11:22 forward open msg: {cipService_LargeForwardOpen 2 32 6 36 1 10 14 4056573071 4056573072 25742 6006 4056573069 3 2500000 1107300256 25 00000 1107300256 163 3} @ [1 0 32 2 36 1] 2023/11/20 17:11:22 New Managed Connection. {ID:25742 OT:4056573071 TO:4056573072 RPI:2.5s Path:[1 0] Open:true} 2023/11/20 17:11:22 context: 0 2023/11/20 17:11:22 hee: 0. timeout: 0 2023/11/20 17:11:22 got connection id 4056573071 = &{ID:25742 OT:4056573071 TO:4056573072 RPI:2.5s Path:[1 0] Open:true} 2023/11/20 17:11:22 Trying to read testtag8 from MapTagProvider 2023/11/20 17:11:22 not a slice 2023/11/20 17:11:22 Trying to read testtag1 from MapTagProvider 2023/11/20 17:11:22 not a slice 2023/11/20 17:11:22 Trying to read testtag10 from MapTagProvider 2023/11/20 17:11:22 not a slice 2023/11/20 17:11:22 Trying to read testtag6 from MapTagProvider 2023/11/20 17:11:22 not a slice 2023/11/20 17:11:22 Trying to read testtag4 from MapTagProvider 2023/11/20 17:11:22 not a slice 2023/11/20 17:11:22 Trying to read testtag5 from MapTagProvider 2023/11/20 17:11:22 not a slice 2023/11/20 17:11:22 Trying to read testtag7 from MapTagProvider 2023/11/20 17:11:22 not a slice 2023/11/20 17:11:22 Trying to read testtag2 from MapTagProvider 2023/11/20 17:11:22 not a slice 2023/11/20 17:11:22 Trying to read testtag3 from MapTagProvider 2023/11/20 17:11:22 not a slice 2023/11/20 17:11:22 Trying to read testtag9 from MapTagProvider 2023/11/20 17:11:22 not a slice 2023/11/20 17:11:22 send unit data service requested: cipService_MultipleService 2023/11/20 17:11:22 context: 0 2023/11/20 17:11:22 hee: 0. timeout: 0 2023/11/20 17:11:22 items: [{Header:{ID:0 Length:0} Data:[] Pos:0} {Header:{ID:178 Length:24} Data:[78 2 32 6 36 1 10 14 142 100 118 23 141 100 202 241 3 0 1 0 32 2 36 1] Pos:0}] 2023/11/20 17:11:22 got forward close from 10.7.194.64:51012 2023/11/20 17:11:22 Closing connection 25742 2023/11/20 17:11:22 Error on connection 10.7.194.64:51012. problem reading eip header. EOF

nicola1695 commented 7 months ago

EDIT: I've seen now that from your file 2500 seconds should correspond to 80 seconds, it's pretty close to my 65 second timeout, maybe I start counting at the wrong time.

danomagnum commented 7 months ago

I see what you mean about ReadMap not using ReadList. I'll get that fixed this evening when I'm running the unit tests for the other change. That should let you just call the function and have it work without having to jump through any hoops to split up the reads. I think your function should work though if you want to go that way.

I will have to investigate the server problem. Might not be cleaning up a connection resource or something. Strange though that it would still work from a different program.

I would probably not connect/disconnect every iteration though. I'd recommend connecting once and using that connection indefinitely (or until something causes it to fail). Because the library uses "connected" class 3 messaging (as opposed to unconnected messages) the PLC reserves resources for the connection during the initial handshake. You wouldn't want it to have to free and re-allocate them over and over.

nicola1695 commented 7 months ago

Yes, I think I will do as you said regarding the connection to the PLCs, thanks for your valuable advice. I can count on that if the connection to one of the PLCs drops, it will automatically reconnect on the first read/write request, am I right? If I can help you in any way I will be happy to answer you regarding the problem on the server. Thank you

danomagnum commented 7 months ago

I can count on that if the connection to one of the PLCs drops, it will automatically reconnect on the first read/write request, am I right?

The first read/write after a connection failure will fully close the connection and return an error. If you retry the read/write it will attempt to reconnect. If that connection fails it will return an error and will retry again on the next read/write.

Here is the unit test for it that walks through how it should work. https://github.com/danomagnum/gologix/blob/master/tests/reconnection_test.go

nicola1695 commented 7 months ago

I will make changes to my code to implement this logic. The intent of my program is to always be running (days), so how long should the "connected" state usually last? (although it's not really a problem since it can reconnect)

danomagnum commented 7 months ago

One connection should be able to last indefinitely unless there is a network or power issue.

In a typical production program I've been using the general layout shown in this example I added. https://github.com/danomagnum/gologix/blob/master/examples/ProductionService/main.go

As for the server not accepting connections anymore I wasn't able to reproduce that. This is the code I was using to test against the server in the Examples/Server_Class3

package main

import (
    "log"

    "github.com/danomagnum/gologix"
)

func main() {
    var err error
    for i := 0; i < 1000; i++ {
        client := gologix.NewClient("localhost")

        err = client.Connect()
        if err != nil {
            log.Printf("Error opening client. %v", err)
            return
        }

        m := make(map[string]any)
        m["testtag1"] = int16(0)
        m["testtag2"] = int32(0)
        //m["testtag3"] = make([]int32, 10)
        m["testdint"] = int32(0)
        m["testint"] = int16(0)

        err = client.ReadMulti(m)
        if err != nil {
            log.Printf("error reading testint. %v", err)
        }
        log.Printf("%2d | multiread map has values %+v", i, m)
        client.Disconnect()
    }

}

I had it connecting and disconnecting every loop like you did and didn't have any issues doing a thousand cycles. If you can provide the minimum code to reproduce I will look into it further.

nicola1695 commented 7 months ago

I can also reproduce the error using your code. To best simulate a real situation I launch the server program on another remote PC, with ping performance ranging from 50ms to 200ms (a poor network). However, if I run the server on localhost the error does not occur. Therefore I think the error occurs under poorer network conditions. Also in my server I simulate two PLCs, one in path 1,0 and one in path 1,1 and I have multiple tags. Among these tags I try to read 10 of these. My server is the following:

package main

import (
    "log"
    "os"
    "time"

    "github.com/danomagnum/gologix"
)

func main() {

    ////////////////////////////////////////////////////
    // First we set up the tag providers.
    //
    // Each one will have a path and an object that fulfills the gologix.TagProvider interface
    // We set those up and then pass them to the Router object.
    // here we're using the build in map tag provider which just maps cip reads and writes to/from a go map
    //
    // In theory, it should be easy to make a tag provider that works with a sql database or pumps messages onto a channel
    // or whatever else you might need.
    ////////////////////////////////////////////////////

    r := gologix.PathRouter{}

    // one memory based tag provider at slot 0 on the virtual "backplane"
    // a message to 2,xxx.xxx.xxx.xxx,1,0 (compact logix) will get this provider (where xxx.xxx.xxx.xxx is the ip address)
    // the message path before the IP address in the msg instruction will be different based on the actual controller
    // you're using, but the part after the IP address is what this matches

    p1 := gologix.MapTagProvider{}
    path1, err := gologix.ParsePath("1,0")
    if err != nil {
        log.Printf("problem parsing path. %v", err)
        os.Exit(1)
    }
    r.Handle(path1.Bytes(), &p1)

    // set up some default tags.
    // using TagWrite() and TagRead() are treadsafe if needed.
    // otherwise you can lock p1.Mutex and manipulate p1.Data yourself
    p1.TagWrite("testtag1", int32(12))
    p1.TagWrite("testtag2", float32(543.21))
    p1.TagWrite("testtag3", int32(3))
    p1.TagWrite("testtag4", float32(42.19))
    p1.TagWrite("testtag5", int32(12))
    p1.TagWrite("testtag6", float32(0.87))
    p1.TagWrite("testtag7", int32(12))
    p1.TagWrite("testtag8", float32(237.43))
    p1.TagWrite("testtag9", int32(100))
    p1.TagWrite("testtag10", float32(127.62))
    p1.TagWrite("testdint", int32(12))
    p1.TagWrite("testint", int16(3))
    //a different memory based tag provider at slot 1 on the virtual "backplane" this would be "2,xxx.xxx.xxx.xxx,1,1" in the msg connection path
    p2 := gologix.MapTagProvider{}
    path2, err := gologix.ParsePath("1,1")
    if err != nil {
        log.Printf("problem parsing path. %v", err)
        os.Exit(1)
    }
    r.Handle(path2.Bytes(), &p2)

    p2.TagWrite("testtag3", bool(true))

    s := gologix.NewServer(&r)

    go s.Serve()

    t := time.NewTicker(time.Second * 5)
    for {
        <-t.C
        p1.Mutex.Lock()
        log.Printf("Data 1: %v", p1.Data)
        p1.Mutex.Unlock()

        // p2.Mutex.Lock()
        // log.Printf("Data 2: %v", p2.Data)
        // p2.Mutex.Unlock()

    }

}

Below is the client code to read 10 tags of the first PLC:

package main

import (
    "log"

    "github.com/danomagnum/gologix"
)

func main() {
    var err error
    for i := 0; i < 1000; i++ {
        client := gologix.NewClient("10.17.70.143")
        //client := gologix.NewClient("localhost")
        err = client.Connect()
        if err != nil {
            log.Printf("Error opening client. %v", err)
            return
        }

        m := make(map[string]any)

        m["testtag1"] = int32(0)
        m["testtag2"] = float32(0)
        m["testtag3"] = int32(0)
        m["testtag4"] = float32(0)
        m["testtag5"] = int32(0)
        m["testtag6"] = float32(0)
        m["testtag7"] = int32(0)
        m["testtag8"] = float32(0)
        m["testtag9"] = int32(0)
        m["testtag10"] = float32(0)

        //m["testdint"] = int32(0)
        //m["testint"] = int16(0)

        err = client.ReadMulti(m)
        if err != nil {
            log.Printf("error reading testint. %v", err)

            return
        }
        log.Printf("%2d | multiread map has values %+v", i, m)
        client.Disconnect()
    }

}

after a few iterations (it's random), the reading for freezes for about 2 minutes and then I get the usual error:

Error opening client. dial tcp 10.17.70.143:44818: connect: connection timed out

The only strange thing I can notice is that always after every disconnection on the server there is the same error:

Error on connection 10.30.7.19:49152. problem reading eip header. EOF

Where, the IP 10.30.7.19 is my PC, and the IP address 10.17.70.143 is that of the remote PC where the server is running.

danomagnum commented 7 months ago

Ok. Good info. I'll try to set up a VM when I get as chance and see if I can reproduce it that way.

What is your subnet mask? With those IPs I'm assuming you're going through a router? Is there a firewall?

nicola1695 commented 7 months ago

The subnet mask is 255.255.252.0. The IP addresses you see are special because they are on my corporate network. Both PCs are without firewalls. However there is good news, I changed the logic by only connecting to the beginning once and in the loop I only call the read operation and for now I no longer encounter that problem.

danomagnum commented 7 months ago

So with that subnet mask you're on different networks so you'd have to go through a router to communicate between the nodes. I wonder if the router was dropping the TCP handshake packets or something.

Either way I'm glad it's working now.

nicola1695 commented 7 months ago

I have a connection problem. To test while doing a for loop of infinite reads I try to stop the server execution. As a result I get the reading error as to be expected. Now, if I restart the server after a few seconds, everything works again, the client is able to reconnect. However, if I wait 10-20 seconds to restart the server, then the client is no longer able to reconnect. What I described also happens in localhost. The server code is the same. Here I copy the client code adapted for infinite readings:

package main

import (
    "log"

    "github.com/danomagnum/gologix"

    "os"
    "os/signal"
    "syscall"
)

func main() {
    sig := make(chan os.Signal, 1)

    // Notify 'sig' channel if there is an Interrupt signal
    signal.Notify(sig, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)

    var err error
    var i int
    //client := gologix.NewClient("10.17.70.143")
    client := gologix.NewClient("localhost")
    err = client.Connect()

    if err != nil {
        log.Printf("Error opening client. %v", err)
        //return
    }

    // Disconnect if code is interrupted
    go func() {

        <-sig

        err := client.Disconnect()
        if err != nil {
            log.Panicf("problem disconnecting. %v", err)
        }

        // Exit
        os.Exit(0)
    }()

    for {

        i++
        m := make(map[string]any)

        m["testtag1"] = int32(0)
        m["testtag2"] = float32(0)
        m["testtag3"] = int32(0)
        m["testtag4"] = float32(0)
        m["testtag5"] = int32(0)
        m["testtag6"] = float32(0)
        m["testtag7"] = int32(0)
        m["testtag8"] = float32(0)
        m["testtag9"] = int32(0)
        m["testtag10"] = float32(0)

        err = client.ReadMulti(m)
        if err != nil {

            log.Printf("error reading. %v", err)

        }

        log.Printf("%2d | multiread map has values %+v", i, m)

    }

}

How could I solve it?

danomagnum commented 7 months ago

Just to make sure I understand, you've got the server and client program running locally. Then you stop the server (to simulate a connection failure, etc...) and start it again. Whether the client reconnects and works properly again depends on how long you leave the server process stopped for?

nicola1695 commented 7 months ago

Yes exactly. Now I also tried the test with the remote server (on the other PC with IP 10.17.70.143). In this case if I restart the server after a few seconds, the client is able to connect and read again. If more seconds pass the for loop stops because the read operation freezes and it no longer prints the error:

error reading. could not start multi read: not connected and connect attemp failed: dial tcp 10.17.70.143:44818: connect: connection refused

Without restarting the server, observing the client for a few minutes, every now and then it comes back to printing a handful of errors like the one just reported above.

Now, if I restart the server nothing happens and the client continues to behave that way. Only by waiting a few more minutes can the connection be re-established. So, I think it's as if too many failed read calls freeze the code and then it takes a long time to re-establish the connection once the server is restarted.

On the local server things are worse because, even if the Reads don't block (errors are always printed), the reconnection doesn't start even after waiting a long time.

Thank you for your help!

nicola1695 commented 7 months ago

I changed the code a bit and so it doesn't seem to crash due to too many bad read requests. First of all I put a pause of 100ms between one read and another, secondly I established that if the connection is lost, I wait 5 seconds to call a subsequent read request (which will autoconnect if the server is operational again). This way it seems like there aren't too many errors and when the server starts working again the client is able to connect and read.

package main

import (
    "log"
    "time"

    "github.com/danomagnum/gologix"

    "os"
    "os/signal"
    "syscall"
)

func main() {
    sig := make(chan os.Signal, 1)

    // Notify 'sig' channel if there is an Interrupt signal
    signal.Notify(sig, os.Interrupt, syscall.SIGINT, syscall.SIGTERM)

    var err error
    var i int
    client := gologix.NewClient("10.17.70.143")
    //client := gologix.NewClient("localhost")
    err = client.Connect()

    if err != nil {
        log.Printf("Error opening client. %v", err)
        //return
    }

    // Disconnect if code is interrupted
    go func() {

        <-sig

        err := client.Disconnect()
        if err != nil {
            log.Panicf("problem disconnecting. %v", err)
        }

        // Exit
        os.Exit(0)
    }()

    for {

        i++
        m := make(map[string]any)

        m["testtag1"] = int32(0)
        m["testtag2"] = float32(0)
        m["testtag3"] = int32(0)
        m["testtag4"] = float32(0)
        m["testtag5"] = int32(0)
        m["testtag6"] = float32(0)
        m["testtag7"] = int32(0)
        m["testtag8"] = float32(0)
        m["testtag9"] = int32(0)
        m["testtag10"] = float32(0)

        err = client.ReadMulti(m)
        if err != nil {

            log.Printf("error reading. %v", err)
        }
        time.Sleep(100 * time.Millisecond)
        log.Printf("%2d | multiread map has values %+v", i, m)

               // Wait if connection is lost
        if !client.Connected {

            log.Println("Trying to reconnect...")
            time.Sleep(5 * time.Second)

        }
    }

}
danomagnum commented 7 months ago

While I would recommend the changes with the reconnect delay and timeout between reads, it should still not stop functioning so I will look into it when I get a chance.

I would recommend to use a time.Ticker instead of a time.sleep(100) for the main polling loop. This way you poll every 100ms instead of 100 ms + the time it took to execute the read. (See https://gobyexample.com/tickers ) Basically you put a ticker := time.NewTicker(100 * time.Millisecond) above your loop and then in the loop you replace the time.Sleep(100 * time.Millisecond) with a <- tiker.C

nicola1695 commented 7 months ago

why is it better not to wait for a read to be completed? Is there a risk that this might block execution if too many reads fail? p.s. Please can you make the 19.1-Beta version available where I saw the ReadMap() change that splits the message? Thank you again

danomagnum commented 7 months ago

It will still wait for a read to be completed, but the time the next read starts will be relative to when you started reading instead of when you stop reading. Now, if your reads go over 100ms then it will immediately begin the next read cycle so maybe that's not desirable, but generally I find it better to be on a more fixed cadence.

I think i accidentally used a capital V instead of lowercase in the version tag and that's why it didn't propagate to pkg.go.dev. I changed it but It usually takes a little bit to make it to the go package server. if it isn't on there by tonight I'll do a v0.20.0 to force it to re-update.

nicola1695 commented 7 months ago

Ok, thanks for your information and advice. I'm wondering if there is a way to set a timeout in the client to make it so when it blocks a read/write it exits. I tried setting client.SocketTimeout = 1 * time.Second before to call client.Connect() but I saw that doing so causes the loop to block after first read. Furthermore, on the server there is the following error highlighted in bold in the penultimate log:

2023/11/23 19:57:24 new connection from 127.0.0.1:36990 2023/11/23 19:57:24 context: 0 2023/11/23 19:57:24 register message: {ProtocolVersion:1 OptionFlag:0} 2023/11/23 19:57:24 context: 0 2023/11/23 19:57:24 ih: 0. timeout: 0 2023/11/23 19:57:24 items: [{Header:{ID:0 Length:0} Data:[] Pos:0} {Header:{ID:178 Length:52} Data:[91 2 32 6 36 1 10 14 20 35 246 84 21 35 246 84 19 35 118 23 18 35 246 84 3 0 0 0 160 37 38 0 160 15 0 66 160 37 38 0 160 15 0 66 163 3 1 0 32 2 36 1] Pos:0}] 2023/11/23 19:57:24 got large forward open from 127.0.0.1:36990 2023/11/23 19:57:24 forward open msg: {cipService_LargeForwardOpen 2 32 6 36 1 10 14 1425416980 1425416981 8979 6006 1425416978 3 2500000 1107300256 2500000 1107300256 163 3} @ [1 0 32 2 36 1] 2023/11/23 19:57:24 New Managed Connection. {ID:8979 OT:1425416980 TO:1425416981 RPI:2.5s Path:[1 0] Open:true} 2023/11/23 19:57:24 context: 0 2023/11/23 19:57:24 ih: 0. timeout: 0 2023/11/23 19:57:24 got connection id 1425416980 = &{ID:8979 OT:1425416980 TO:1425416981 RPI:2.5s Path:[1 0] Open:true} 2023/11/23 19:57:24 Trying to read testtag4 from MapTagProvider 2023/11/23 19:57:24 not a slice 2023/11/23 19:57:24 Trying to read testtag6 from MapTagProvider 2023/11/23 19:57:24 not a slice 2023/11/23 19:57:24 Trying to read testtag1 from MapTagProvider 2023/11/23 19:57:24 not a slice 2023/11/23 19:57:24 Trying to read testtag3 from MapTagProvider 2023/11/23 19:57:24 not a slice 2023/11/23 19:57:24 Trying to read testtag5 from MapTagProvider 2023/11/23 19:57:24 not a slice 2023/11/23 19:57:24 Trying to read testtag7 from MapTagProvider 2023/11/23 19:57:24 not a slice 2023/11/23 19:57:24 Trying to read testtag8 from MapTagProvider 2023/11/23 19:57:24 not a slice 2023/11/23 19:57:24 Trying to read testtag9 from MapTagProvider 2023/11/23 19:57:24 not a slice 2023/11/23 19:57:24 Trying to read testtag10 from MapTagProvider 2023/11/23 19:57:24 not a slice 2023/11/23 19:57:24 Trying to read testtag2 from MapTagProvider 2023/11/23 19:57:24 not a slice 2023/11/23 19:57:24 send unit data service requested: cipService_MultipleService 2023/11/23 19:57:24 context: 0 2023/11/23 19:57:24 ih: 0. timeout: 0 2023/11/23 19:57:24 Got unknown service at send unit data handler 3 2023/11/23 19:57:24 send unit data service requested: cipService_GetAttributeList

danomagnum commented 7 months ago

When you enable the Timeout, it starts a background task that polls the PLC to keep the connection alive. It polls with cipService_GetAttributeList instead of the normal reads or writes. The server doesn't handle that so it ignores it.
Unfortunately that means it never replies to the client so the client sits there waiting for the response to come back.

That's a pretty important part of the server I need to get fixed. #13

In the meantime, version 0.20.1 should have timeouts on all the client reads and writes so you should be able to recover.

0.20.1 is live on github, but hasn't made it to the go package mirror yet. Should be there by the morning.

nicola1695 commented 7 months ago

I'm testing version 0.20.2 and it seems that your addition of timeout in read/write operations solved all my problems (I set client.SocketTimeout = 1 * time.Second). In fact, thanks to the production of the timeout error, the reading cycle no longer enters freezing situations and can retry the connection. Thanks for your work and help!