bugst / go-serial

A cross-platform serial library for go-lang.
BSD 3-Clause "New" or "Revised" License
622 stars 191 forks source link

slow sends on Windows (USB) #15

Closed jcw closed 7 years ago

jcw commented 7 years ago

Hello Cristian,

Your serial package is getting better and better all the time - first off, a huge thanks for creating this, covering all the platforms, and doing so in pure Go. As you may remember, I ran into an issue (#9) on macOS a while back, and am happy to report that it has been working charmingly ever since.

On Windows, there's still an issue we can't quite figure out - some details here, but that won't be very helpful yet, I'm afraid.

The behaviour is that communication is really slow - one line of text is sent, and then I need to wait for its echo and prompt coming back. This cycle repeats until an entire file is sent - it works quickly on Linux and Mac, but not on Windows.

Someone mentioned seeing this behaviour elsewhere, and that it might be related to hardware handshaking (this is USB, so all virtual). The USB device on the other end is custom-coded (not C), and completely ignores RTS/CTS etc, so I'm not really sure where to look next.

Anyway, just wanted to flag this here as placeholder, until we can figure out a bit more. The problem has been reported by at least two people, both Win10 AFAIK.

Cheers, Jean-Claude

PS. Saw your new work on a new USB API - no rush, but I'm looking forward to seeing that in the main branch eventually. It means I could provide device info in the port selection menu I'm currently presenting in Folie. Fantastic work, all of it. Many thanks for taking it further and sharing it all.

jcw commented 7 years ago

To follow up - we're seeing a 1-second slowdown after every line, but it happens only on Windows. Could it be related to this logic? What is the purpose of the timeout? Another aspect which might be related is that incoming text lines end in LF, not in CR+LF in our setup.

cmaglie commented 7 years ago

Hi @jcw

the purpose of that timeout is to make the following check working:

https://github.com/bugst/go-serial/blob/v1/serial_windows.go#L76-L84

basically, it seems that the only realiable way, on Windows, to check for a "disappearing port" (for example when you unplug a USB cable) is to do a setCommState on the port and see if it fails. Any other system call to the already-opened file handle will be successfull without any error, in particular the Read call, without a timeout will block forever.

So the current code do something like this:

Hope this clarify a bit.

Going back to the original issue: I would expect the Read call to return immediately, with the new incoming data and not timeout after 1 sec. So in some way it seems that the first call to Read "miss" the incoming data that get caught on the second call (after testing if the port is still alive).

Can you write a small example that demonstrates the problem? At this point it's very important to produce a small test case to reproduce the error, this will allow to narrow and identify if the problem is in the library, in the hardware or in the OS.

SevenW commented 7 years ago

Hi, I am looking together with @jcw into the slow-serial on windows we have. What I now see is that a call to Write() is actually only returning after 1 second. My guess is that there is a previous Read() call in progress. As no data flows in, it hits a timeout of one second. Then the Write() can be executed.

It would be nice if Read() and Write() could be concurrent, but it does not seem to be so.

What we are doing is having a main program sending commands and checking replies. We have two goroutines. One forever Read() the serial port and if data available it is pushed into a channel (read by the main) Second: forever monitoring a channel, for data to be send and if available Write().

If my theory is right, then I am looking for a non-blocking Read(). Is that available in the package?

Update: Looking a bit further, I think the mechanism is indeed: Read pending, no incoming data Write() requested. Write has to wait until the syscall.Readfile returns after a timeout. The timeout is not an error status (because I do not receive err other than nil). Write executed. The serial device at the other side of the lines echoes the command. Read() now receives data and finally returns without error.

In our application we have to wait the full timeout of one second before the write can happen because we start reading right after the previous command resulted in its final output.

As it is a serial terminal application, continuous reading is a required property.

SevenW commented 7 years ago

So in some way it seems that the first call to Read "miss" the incoming data that get caught on the second call (after testing if the port is still alive).

This could be programmed more tidy I am sure. The chosen approach of reading forever helps flushing unwanted data, that may come for whatever reason. So another option would be to have a Flush() or other means to discard unwanted data.

So what happens if the timeout on the Read() is really short? Like 10ms?

SevenW commented 7 years ago

Proposal:

ReadTotalTimeoutConstant:    10, // 10 msec

and fire the getCommState setCommState combo every hundred iteration of the for-loop, to maintain a once every second pace for alive check.

cmaglie commented 7 years ago

I am looking together with @jcw into the slow-serial on windows we have. What I now see is that a call to Write() is actually only returning after 1 second. My guess is that there is a previous Read() call in progress. As no data flows in, it hits a timeout of one second. Then the Write() can be executed.

Ah, so it's the Write that is unexpectedly blocking, do you think that a test like:

port := serial.Open(....)
go func() {
    time.Sleep(10)
    port.Write([]bytes(0x70))
    time.Sleep(10)
    port.Write([]bytes(0x70))
    time.Sleep(10)
    port.Write([]bytes(0x70))
    time.Sleep(10)
    port.Write([]bytes(0x70))
    time.Sleep(10)
    port.Write([]bytes(0x70))
    fmt.Println("Done writing!")
}()
buff := make([]bytes, 1024)
port.Read(buff)

is enough to trigger the bug? (Done writing! should appear almost immediately, but if the bug is triggered it should appear after 5 seconds). I can't try this right now, I could make the test above tomorrow.

It could be that my library must use overlapped I/O to allow concurrent read and write. https://msdn.microsoft.com/en-us/library/ff802693.aspx it's a bit more convoluted code for read and write but if this is necessary to use concurrent I/O I'll do it during this weekend.

Proposal: ReadTotalTimeoutConstant: 10, // 10 msec and fire the getCommState setCommState combo every hundred iteration of the for-loop, to maintain a once every second pace for alive check.

This will make a CPU intensive spin-loop, even when the serial port is idle (100 calls to getCommState/setCommState per second). BTW you can use this as a temporary workaround until the bug is properly fixed.

cmaglie commented 7 years ago

If my theory is right, then I am looking for a non-blocking Read(). Is that available in the package?

it's not available. This is by design: your current approach (two concurrent goroutines, one for reading and one for writing) is the suggested way to handle streams without timeouts, IMHO this is the go-way to do it. :+1:

I'm resisting the temptation to add non-blocking read/write (also because implementing timeouts in a consistent way across the various OS is a bit complicated :stuck_out_tongue:)

SevenW commented 7 years ago

This will make a CPU intensive spin-loop, even when the serial port is idle (100 calls to getCommState/setCommState per second). BTW you can use this as a temporary workaround until the bug is properly fixed.

No, that is not the way I intended it. I already feel that there is a lot of USB-control messages flying over the line.

I tested the following code in the Read() function, and it works like a charm (although I did not look at CPU load).

func (port *windowsPort) Read(p []byte) (int, error) {
    var readed uint32
    var n uint8
    n = 0
    params := &dcb{}
    for {
        if err := syscall.ReadFile(port.handle, p, &readed, nil); err != nil {
            return int(readed), err
        }
        if readed > 0 {
            return int(readed), nil
        }

        // At the moment it seems that the only reliable way to check if
        // a serial port is alive in Windows is to check if the SetCommState
        // function fails.
        if n >= 100 {
            getCommState(port.handle, params)
            if err := setCommState(port.handle, params); err != nil {
                port.Close()
                return 0, err
            }
            n = 0
        }
        n++
    }
}

and

    // Set timeouts to 10 millisecond
    timeouts := &commTimeouts{
        ReadIntervalTimeout:         0xFFFFFFFF,
        ReadTotalTimeoutMultiplier:  0xFFFFFFFF,
        ReadTotalTimeoutConstant:    10, // 10 msec
        WriteTotalTimeoutConstant:   0,
        WriteTotalTimeoutMultiplier: 0,
    }

Bottomline: getCommState is called once a second as before, and a Write() has to wait at most 10ms before a pending readline returns.

The problem is sending a write when a read is waiting for characters to arrive. Only after the timeout to write seems to get executed. I had some trace logging around Read() and Write() calls that showed up like this:

read
 <--- actual call to Read() here
write
 <--- actual call to Write() here
 <--- one second delay here!
writed (=return from Write())
readed (=return from Read())
SevenW commented 7 years ago

I did some performance measurements with different timeouts, and in some tests I commented out the getCommState/setCommState The baseline is running the same go program on raspberry pi 2.

raspberry pi                                 19s
win10 timeout 1000ms                       1477s
win10 timeout   10ms                        153s
win10 timeout    1ms - no CommState          86s
win10 no timeout     - no CommState          26s

As you can see the timeout of one second is really bad for performance, but reasonable performance only is achieved without the timeout mechanism.

cmaglie commented 7 years ago

Oh sorry just noticed that you suggest to call get/setCommState every 100 iterations, anyway you can't avoid to call Read, if another way to fix this exists I'd like to avoid the need for tight polling-loops.

Can you check if the snippet I posted above (https://github.com/bugst/go-serial/issues/15#issuecomment-273600960) triggers the bug?

SevenW commented 7 years ago

Can you check if the snippet I posted above (#15 (comment)) triggers the bug?

Yes it does.

cmaglie commented 7 years ago

Yes it does.

Thanks! I've made this patch #16, it solves the problem on the test, may you try if it works for your app?

SevenW commented 7 years ago

Issue Solved! I am really happy with the result. It is on par with linux now:

raspberry pi                                 19.5s
win10 overlapped-io                          21.4s

Thanks!

cmaglie commented 7 years ago

Great news!

I'd like to point out how important is to have a small test program to reproduce the error:

Finally, thank you for the detailed bug report and testing :-)