Open soypat opened 2 years ago
After reading some termios stuff it seems this is not expected functionality: If VMIN
is zero then the function should be a purely timed function and return after 1 second.
Hey there! Good to see another sers user. Thanks for taking the time to provide a repro case. I do see the same behavior that you are getting, i.e. the program takes 50 seconds to terminate, not 1 second as the timeout/VTIME setting might lead one to believe.
My answer below goes into some detail; the TL;DR is this: Your issue is real, but it is unlikely that I will make it go away. There is a workaround that will give you the VMIN/VTIME behavior you want. However, there are most likely better ways to achieve your goals. Finally, I back up my arguments with some straces.
Unfortunately, you stumbled into the darkest corner of the sers library. As one line in the manual hints at, I am considering dropping support for SetReadParams in future releases of the library. There are two reasons for this.
The first is that the interface mirrors termios, but sers aims to support non-termios platforms as well, and it is not always easy to emulate VMIN/VTIME on other systems.
The second is that I would like to support the standard set of Set{,Read,Write}Deadline methods that are often required by protocol implementations that work over different kinds of stream transports and are the standard for I/O timeouts in the Go standard library. The easiest route to do this is by letting the Go runtime handle the I/O for the serial port file descriptor, using nonblocking I/O with epoll/kqueue etc. behind the scenes. Apart from deadline support, this also means that I/O on serial ports does not tie up an OS thread but is scheduled by the runtime, as it is for sockets, which is a nice bonus.
The termios man page writes the following about the VMIN/VTIME settings in conjunction with nonblocking I/O:
POSIX does not specify whether the setting of the O_NONBLOCK file
status flag takes precedence over the MIN and TIME settings. If
O_NONBLOCK is set, a [read(2)](https://www.man7.org/linux/man-pages/man2/read.2.html) in noncanonical mode may return
immediately, regardless of the setting of MIN or TIME.
Thus using nonblocking I/O as I wish leaves VMIN/VTIME behavior unspecified. As you found out, nonblocking I/O seems to override your settings. There are two traces attached below to support this.
You can get around this if you force blocking I/O on the file descriptor that underlies the SerialPort. For instance, you can modify the program to do the following.
osf, err := os.OpenFile(fn, os.O_RDWR, 0666)
if err != nil {
	return err
}
f, err := sers.TakeOver(osf)
if err != nil {
	return err
}
As the documentation points out, TakeOver calls osf.Fd(), which will cause the Go runtime to not do nonblocking I/O on the file descriptor because it does not know what other code might use the file descriptor in incompatible ways. On my machine, this program terminates after 1 second with error timeout, as you initially expected.
The workaround works in the current version of sers and will most likely continue to work in future versions that offer SetReadParams, but I am not guaranteeing this. Note that the workaround means that a Close() does not unblock readers and you have to make your readers check for shutdown in some other manner.
First I have to start with a question: What do you want to achieve?
Do you want to be able to unblock serial port readers in case you shut down the code that talks to the serial port? In this case it is enough to call Close() on the SerialPort to unblock readers.
Do you want to have a timeout on Read()? In that case SetReadDeadline would fit the bill. I have had an untested implementation lying around on my machine for literally years and I could expedite merging it if this would help you. Please get in touch if this is the case.
In order to get a grip on the behavior, I used strace, which prints out a trace ("a log") of all the syscalls a program makes. I ran the program as strace -tt -f -v ./readclose [ttyfilename].
The gist is this (timeout in the closing goroutine set to 4 seconds, not 50):
[pid 8828] 09:02:32.955015 openat(AT_FDCWD, "/dev/ttyS4", O_RDWR|O_NOCTTY|O_NONBLOCK <unfinished ...>
[pid 8828] 09:02:32.955391 <... openat resumed> ) = 3
[pid 8828] 09:02:32.955756 epoll_ctl(4, EPOLL_CTL_ADD, 3, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=3692930888, u64=140324569458504}}) = 0
[pid 8828] 09:02:32.956017 ioctl(3, SNDCTL_TMR_START or TCSETS, {c_iflags=0, c_oflags=0x4, c_cflags=0xcbd, c_lflags=0xa30, c_line=0, c_cc[VMIN]=0, c_cc[VTIME]=10, c_cc="\x03\x1c\x7f\x15\x04\x0a\x00\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"}) = 0
[pid 8828] 09:02:32.956195 read(3, <unfinished ...>
[pid 8828] 09:02:32.956222 <... read resumed> 0xc00001e100, 128) = -1 EAGAIN (Resource temporarily unavailable)
[pid 8828] 09:02:36.958858 write(1, "============================> cl"..., 40============================> close now
[pid 8828] 09:02:36.959449 close(3 <unfinished ...>
The file /dev/ttyS4 is opened and assigned fd 3. The Go runtime decides to do nonblocking I/O and registers the fd with its epoll instance to get notifications. Then you see the VTIME setting taking place, VTIME=10. The Go runtime reads from the FD, which is in nonblocking mode, and immediately gets EAGAIN, i.e. no data. This already hints at VTIME being ignored. No I/O happens on the file descriptor as epoll never notifies the runtime of anything. In other words, epoll does not respect the VTIME setting. After the SerialPort was closed, the FD 3 is closed as well.
Read and epoll seem to ignore VTIME in nonblocking mode, which explains the behavior you are seeing.
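The immediate EAGAIN can be reproduced without a serial port. Below is a minimal sketch, assuming a Unix-like system and using a pipe as a stand-in for the tty. A pipe has no VMIN/VTIME at all, so this only illustrates what read(2) does on an empty nonblocking descriptor; the helper name emptyNonblockingReadIsEAGAIN is made up for this illustration.

```go
package main

import (
	"fmt"
	"syscall"
)

// emptyNonblockingReadIsEAGAIN creates a pipe, switches its read end to
// nonblocking mode and reads while no data is available. It reports whether
// the read failed with EAGAIN. The pipe is only a stand-in for a serial port.
func emptyNonblockingReadIsEAGAIN() (bool, error) {
	fds := make([]int, 2)
	if err := syscall.Pipe(fds); err != nil {
		return false, err
	}
	defer syscall.Close(fds[0])
	defer syscall.Close(fds[1])

	if err := syscall.SetNonblock(fds[0], true); err != nil {
		return false, err
	}

	buf := make([]byte, 128)
	_, err := syscall.Read(fds[0], buf) // returns right away, nothing to read
	return err == syscall.EAGAIN, nil
}

func main() {
	got, err := emptyNonblockingReadIsEAGAIN()
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("read returned EAGAIN:", got)
}
```

The read returns at once instead of waiting, which is the same pattern as the read(3, ...) = -1 EAGAIN line in the trace.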
Again the gist:
[pid 10109] 09:08:24.899707 openat(AT_FDCWD, "/dev/ttyS4", O_RDWR|O_CLOEXEC <unfinished ...>
[pid 10109] 09:08:24.900539 <... openat resumed> ) = 3
[pid 10109] 09:08:24.901344 epoll_ctl(4, EPOLL_CTL_ADD, 3, {EPOLLIN|EPOLLOUT|EPOLLRDHUP|EPOLLET, {u32=3761674056, u64=139950976046920}} <unfinished ...>
[pid 10109] 09:08:24.901528 <... epoll_ctl resumed> ) = 0
[pid 10109] 09:08:24.903527 ioctl(3, TCGETS, {c_iflags=0, c_oflags=0x4, c_cflags=0xcbd, c_lflags=0xa30, c_line=0, c_cc[VMIN]=0, c_cc[VTIME]=10, c_cc="\x03\x1c\x7f\x15\x04\x0a\x00\x00\x11\x13\x1a\x00\x12\x0f\x17\x16\x00\x00\x00"}) = 0
[pid 10109] 09:08:24.904183 read(3, <unfinished ...>
[pid 10109] 09:08:25.909699 <... read resumed> "", 128) = 0
The file is opened and added to epoll by the Go runtime as well. However, after calling .Fd(), calls to .Read() do issue read()s directly, without waiting for readiness notification by epoll. You can see this in the read() call taking place unconditionally and taking 1 second to complete, matching the VTIME setting.
I'm sorry for not answering sooner, how rude of me.
I'll be back and working with this repo sometime within the next 20ish days. My use case is the following: I have an actuator that must be initialized and then is ready to take commands via Write and reply status via Read. Below is a sketch showing the current implementation:
If the actuator at any point is disconnected or there is no actuator at initialization the whole program hangs indefinitely. I have only detected this issue when trying to run the program with a disconnected actuator. If the actuator is connected the program works perfectly fine.
My worry is that if during a mission this actuator for one reason or another stops replying a goroutine is leaked in the best case, and in the worst case the whole program is halted. It'd be nice if there was a read timeout.
Before your reply I was planning on having a watchdog checking elapsed time during a read transaction and cycling the file if the timeout was triggered (closing and reopening the file). I am curious as to how you'd implement the Read timeout though.
Thank you for your quick reply and greatly detailed explanation <3!
No worries, life happens.
Regarding your code: Yes, you do want some kind of timeout. There are multiple ways to go about it.
Working a bit with your code and assuming there's a SetReadDeadline() method (e.g. from a net.TCPConn, but bear with me) on your p.Comm, I can slightly rewrite it as shown below.
This program will not block indefinitely, assuming the write completes. Notice how I used io.ReadFull instead of a plain Read.
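As a rough illustration of the deadline plus io.ReadFull pattern, here is a minimal sketch that uses net.Pipe from the standard library as a stand-in for the serial port, since a net.Conn is known to provide SetReadDeadline. The names deadlineReader, readReply and pipeDemo are hypothetical and not part of sers.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"net"
	"os"
	"time"
)

// deadlineReader is the subset of net.Conn that the pattern needs.
type deadlineReader interface {
	io.Reader
	SetReadDeadline(t time.Time) error
}

// readReply reads exactly n bytes, or fails once timeout has elapsed.
func readReply(r deadlineReader, n int, timeout time.Duration) ([]byte, error) {
	if err := r.SetReadDeadline(time.Now().Add(timeout)); err != nil {
		return nil, err
	}
	buf := make([]byte, n)
	// io.ReadFull keeps reading until buf is full; a plain Read may return
	// fewer bytes than requested.
	if _, err := io.ReadFull(r, buf); err != nil {
		return nil, err
	}
	return buf, nil
}

// pipeDemo reports whether a read from a silent peer timed out, and what was
// read once the peer answered in two separate writes.
func pipeDemo() (timedOut bool, reply string) {
	a, b := net.Pipe()
	defer a.Close()
	defer b.Close()

	// Nobody writes yet, so the read has to end with a deadline error.
	_, err := readReply(a, 4, 50*time.Millisecond)
	timedOut = errors.Is(err, os.ErrDeadlineExceeded)

	go func() {
		b.Write([]byte("OK"))
		b.Write([]byte("!!"))
	}()
	buf, err := readReply(a, 4, time.Second)
	if err == nil {
		reply = string(buf)
	}
	return timedOut, reply
}

func main() {
	timedOut, reply := pipeDemo()
	fmt.Println("timed out:", timedOut, "reply:", reply)
}
```

The deadline is set anew before every exchange because it is an absolute point in time, not a per-call duration.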
The latest release of sers doesn't have a SetReadDeadline on the SerialPort, unfortunately. However, if you are using Linux or OS X, you might be in luck. You can try using the remove-cgo branch of the library, i.e. run go get -u github.com/distributed/sers@remove-cgo on your project. You can try a type conversion on the SerialPort to get access to the deadline methods, see: https://github.com/distributed/sers/blob/remove-cgo/verification/readdeadline/readdeadline.go. I plan to introduce the deadline methods to the Windows port as well, but I won't and can't promise you a timeline. The advantage of using the deadline methods is that they are what the net.Conn interface provides, so you can easily swap out a serial port for e.g. a TCP connection in your programs, which can be handy.
Even without a read deadline as described above, you still have options. I use sers in various setups and typically I do the following. The described approach works even without deadlines.
I have a goroutine that reads messages (i.e. packets, delimited groups of bytes) from the serial port and sends them on a channel. This goroutine continues until it gets an error reading from the serial port. If I want to stop the reading goroutine, I close the serial port, which makes the pending Read() in the read goroutine return. This functionality is what is verified by verification/readclose.go. The goroutine looks something like this:
// readGoroutine reads whole packets and sends them on channel c. If there is
// no actor to receive from c, the channel done should be closed.
// This function can be started with an errgroup (golang.org/x/sync/errgroup)
// with eg.Go(func() error { return readGoroutine(sp, packchan, ctx.Done()) })
func readGoroutine(sp sers.SerialPort, c chan<- *Packet, done <-chan struct{}) error {
	for {
		pack, err := readFrame(sp)
		if err != nil {
			return err
		}
		select {
		case c <- pack:
		case <-done:
			// done is closed: drop the packet instead of blocking on c.
		}
	}
}
When you want to send-then-receive you could do the following:
func exchange(ctx context.Context, out []byte) (*Packet, error) {
	_, err := sp.Write(out)
	if err != nil {
		return nil, err
	}
	select {
	case pack := <-packchan:
		return pack, nil
	case <-ctx.Done():
		return nil, io.EOF
	case <-time.After(100 * time.Millisecond):
		return nil, fmt.Errorf("timeout while waiting to receive packet")
	}
}
To add two things that are not sers-specific, but might still be of help: It pays to have some kind of message definition, e.g. "first a byte 0xfe, then a byte N that indicates the length of the payload, N payload bytes, then a CRC-16 of all the N payload bytes" and use that for communication.
Consider what happens if your serial line happens to be noisy and your computer receives an extra byte: This means that you can read e.g. 11 instead of 10 bytes. After you read your 10 byte message, there is still one extra byte left in the receive buffer. The next time you receive a 10 byte message, that 1 byte will carry over etc. Barring further errors, every later read stays misaligned and you will never receive a correct message again.
A similar issue can happen if the actuator responds, but the response only arrives after timeout. This can also happen because of delays at the OS or application layer. In this case, the next read operation will read an old message, with the pattern repeating indefinitely if you always read one message for each one sent.
There are many more reasons to introduce some kind of message definition.
Because of issues such as those described above, it also pays to number your messages so that you know that you just received a reply to request number 37. If you receive an old reply or one that doesn't pertain to any request you made, you can ignore it.
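A minimal sketch of what matching replies by number could look like, assuming replies arrive on a channel as in the read goroutine approach. The reply type, its fields and the functions awaitReply and staleReplyDemo are made up for this illustration and are not part of sers.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// reply is a hypothetical decoded message carrying the number of the
// request it answers.
type reply struct {
	Seq  uint8
	Data []byte
}

// awaitReply waits for the reply numbered want and silently drops stale or
// unsolicited replies. It gives up once timeout has elapsed.
func awaitReply(replies <-chan reply, want uint8, timeout time.Duration) (reply, error) {
	timer := time.NewTimer(timeout)
	defer timer.Stop()
	for {
		select {
		case r := <-replies:
			if r.Seq == want {
				return r, nil
			}
			// Not the reply we are waiting for: ignore it and keep waiting.
		case <-timer.C:
			return reply{}, errors.New("timeout while waiting for reply")
		}
	}
}

// staleReplyDemo queues a late reply to request 36 followed by the reply to
// request 37, then waits for number 37.
func staleReplyDemo() (string, error) {
	replies := make(chan reply, 2)
	replies <- reply{Seq: 36, Data: []byte("old")}
	replies <- reply{Seq: 37, Data: []byte("new")}
	r, err := awaitReply(replies, 37, 100*time.Millisecond)
	return string(r.Data), err
}

func main() {
	data, err := staleReplyDemo()
	fmt.Println(data, err)
}
```

A single timer covers the whole wait, so a stream of stale replies cannot extend the timeout.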
TL;DR:
SetReadDeadline modification to your program.
.TakeOver(). This is the least future safe option, though.
I don't, I let them be handled by the poll package in the Go runtime. I merely call SetReadDeadline and friends on the serial port file descriptor (as *os.File) and use the functionality already provided.
What happens under the hood is this: You specify a read deadline, e.g. 2 seconds from now, then you do a .Read(). The Go runtime doesn't block in read() on the serial port file descriptor; instead it adds it to the set of file descriptors watched for readability. Under Linux this mechanism uses epoll. When the runtime actually blocks because there is nothing else to do, using epoll_wait, it specifies a timeout of 2 seconds*. If data arrives within these 2 seconds, epoll will inform the Go runtime about the file descriptors on which I/O can be done. Otherwise it will return after 2 seconds. At this point, the Go runtime checks that the deadline has been exceeded, unblocks the goroutine running Read() and returns an error to it.
Your program sees a timeout on a synchronous read operation. The Go runtime creates this behavior by using asynchronous I/O primitives and returning control to your goroutine at appropriate times with the corresponding return count/error return values.
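The same mechanism can be observed on any *os.File that the runtime poller supports. Here is a minimal sketch, assuming Linux or another Unix where pipes are pollable, with a pipe standing in for the serial port file descriptor. The function name readTimesOut is made up for this illustration.

```go
package main

import (
	"errors"
	"fmt"
	"os"
	"time"
)

// readTimesOut reads from the read end of an empty pipe with a deadline set.
// It reports whether the read ended with a deadline error and how long the
// read was blocked.
func readTimesOut(timeout time.Duration) (bool, time.Duration, error) {
	r, w, err := os.Pipe()
	if err != nil {
		return false, 0, err
	}
	defer r.Close()
	defer w.Close()

	// The deadline is handled by the runtime poller, not by the kernel's
	// read(2); the goroutine is parked until data arrives or time is up.
	if err := r.SetReadDeadline(time.Now().Add(timeout)); err != nil {
		return false, 0, err
	}
	start := time.Now()
	_, err = r.Read(make([]byte, 128))
	return errors.Is(err, os.ErrDeadlineExceeded), time.Since(start), nil
}

func main() {
	timedOut, elapsed, err := readTimesOut(200 * time.Millisecond)
	if err != nil {
		fmt.Println("setup failed:", err)
		return
	}
	fmt.Println("timed out:", timedOut, "after", elapsed)
}
```

From the caller's point of view this is an ordinary blocking Read that happens to return an error after the deadline.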
So today I had to work with yet another actuator and settled for your library yet again. I keep coming back to it with new insights.
So I like your readGoroutine approach. The problem it has is the one you mentioned regarding the read buffer being filled with noise in between reads- I think it can be mitigated with a Flush function. I really would like a type that provides this synchronous functionality wrapping the SerialPort type. If you don't mind I'd like to PR this functionality to the repo (maybe best suited as a subpackage). Whatdya think? Also: Is there no way to see how many bytes are in the buffer ready to be read?
Here's an example of what the API could look like:
I would not work with time based ways of distinguishing messages. It limits message throughput and whether your protocol works is dependent on how timely data is being delivered to your program. Time based approaches work well for real time systems where you have tight bounds on when data arrives. A scheduling hiccup in your PC can easily move timing around by dozens or hundreds of milliseconds.
As I mentioned above, I believe the best way to find distinct messages in a byte stream is to define what a message looks like. The following article has a nice overview: https://eli.thegreenplace.net/2009/08/12/framing-in-serial-communications
Note how the article discounts time based approaches as approach (1).
Usually I go for a message definition something like the following:
ff [payload len N: 0-255] payload*N [checksum of the previous bytes]
To read a message you write a function that looks for a 0xff byte, then reads another byte and interprets it as a length N, then reads N bytes, puts them into a buffer, then calculates a checksum of the previous bytes and compares it to a received checksum. If the calculated and received checksums match, it is very likely you received a correct message. If so, deliver it to your program, otherwise drop it.
Even if your data transmission is disturbed, the above method has a good practical chance to lock onto a subsequent ff and start receiving correct data again*.
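A minimal sketch of such a reader follows. The message definition above does not name a checksum algorithm, so this sketch assumes a plain 8-bit sum over the start byte, the length byte and the payload; a real protocol would more likely use a CRC. The names readFrame, encodeFrame and decodeAll are chosen for this illustration only.

```go
package main

import (
	"bufio"
	"bytes"
	"errors"
	"fmt"
	"io"
)

const startByte = 0xff

var errChecksum = errors.New("checksum mismatch")

// checksum is an assumed placeholder: the 8-bit sum of all bytes.
func checksum(b []byte) byte {
	var sum byte
	for _, c := range b {
		sum += c
	}
	return sum
}

// readFrame skips bytes until it sees the start byte, then reads the length
// byte N, N payload bytes and one checksum byte.
func readFrame(r *bufio.Reader) ([]byte, error) {
	for {
		b, err := r.ReadByte()
		if err != nil {
			return nil, err
		}
		if b == startByte {
			break
		}
	}
	n, err := r.ReadByte()
	if err != nil {
		return nil, err
	}
	body := make([]byte, int(n)+1) // payload plus checksum byte
	if _, err := io.ReadFull(r, body); err != nil {
		return nil, err
	}
	payload, got := body[:n], body[n]
	want := checksum(append([]byte{startByte, n}, payload...))
	if got != want {
		return nil, errChecksum
	}
	return payload, nil
}

// encodeFrame builds a frame for a payload of at most 255 bytes.
func encodeFrame(payload []byte) []byte {
	f := append([]byte{startByte, byte(len(payload))}, payload...)
	return append(f, checksum(f))
}

// decodeAll returns the payloads of all valid frames in data and drops
// frames whose checksum does not match.
func decodeAll(data []byte) [][]byte {
	r := bufio.NewReader(bytes.NewReader(data))
	var out [][]byte
	for {
		p, err := readFrame(r)
		if err == errChecksum {
			continue
		}
		if err != nil {
			return out
		}
		out = append(out, p)
	}
}

func main() {
	// Two noise bytes in front of the frame are skipped by the reader.
	stream := append([]byte{0x01, 0x02}, encodeFrame([]byte("hi"))...)
	for _, p := range decodeAll(stream) {
		fmt.Printf("%q\n", p)
	}
}
```

After a checksum mismatch the reader simply resumes scanning for the next start byte, which is how it gets back in sync after a disturbance.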
Concerning your code, I would make Tx wait in a select, not in a while xxx { sleep } loop. Note in my example:
case pack := <- packchan:
return pack, nil
case <-ctx.Done():
return nil, io.EOF
case <-time.After(100*time.Millisecond):
return nil, fmt.Errorf("timeout while waiting to receive packet")
This implements waiting for the next message on packchan, returning if nothing arrives within 100 ms, and is even cancellable through a context.
Thanks for the offer of the PR. Sers is a library focussed on providing access to serial ports, not to framing. If you would like to publish your code, I suggest you do it in a separate repository on your account.
Regarding your question: No, there is no way to know how much data is ready to read. You can kind of fake it if your program reads into a buffer and you query that buffer's length. I am not sure what you want to achieve. Knowledge about the amount of readable data is rarely required in Go programs.
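One way to fake it with the standard library is bufio.Reader, sketched below under the assumption that the serial port is wrapped in a bufio.Reader. Note that Buffered() only counts bytes already pulled into the program's own buffer, not what the OS still holds. The helper names bufferedCount and demoBuffered are made up, and a strings.Reader stands in for the port.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// bufferedCount blocks until at least one byte is available, then reports
// how many bytes are sitting in the bufio.Reader's buffer. Nothing is
// consumed: Peek leaves the data in place for later reads.
func bufferedCount(br *bufio.Reader) (int, error) {
	if _, err := br.Peek(1); err != nil {
		return 0, err
	}
	return br.Buffered(), nil
}

// demoBuffered runs bufferedCount on an in-memory reader holding input.
func demoBuffered(input string) (int, error) {
	return bufferedCount(bufio.NewReader(strings.NewReader(input)))
}

func main() {
	n, err := demoBuffered("hello")
	fmt.Println(n, err)
}
```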
ff bytes and make the receiver search for frames in the middle of frames. Note that whatever the framing scheme, it is not possible to guard against all possible errors. The following article has a nice overview: https://eli.thegreenplace.net/2009/08/12/framing-in-serial-communications
Thank you a billion. This was a 10/10 read.
Sers is a library focussed on providing access to serial ports, not to framing
After reading the article above this sounds super reasonable heh.
Thanks for all the tips- super valuable info. Gonna go try write a few serial framing libraries and see what the fuss is all about
Tested with raspberry Pi and my desktop:
Steps to reproduce
Modify ReadClose verification program to be as follows
Program now blocks for 50 seconds even though timeout is 1 second.
I am available to help resolving this issue but am unfamiliar with this repo and CGo.