Closed sipsma closed 4 years ago
Change https://golang.org/cl/178919 mentions this issue: syscall: use Ctty before fd shuffle
I'm starting to wonder whether this is actually a bug. The change in behavior has broken a couple of existing programs.
Perhaps we should instead clearly define the Ctty
field as being the file descriptor number in the child, not the parent. Then the code before CL 178919 would work fine, provided we can always assume that the ctty should be a descriptor that is open in the child. If you wrote your code with that understanding, would that clarify matters?
Marking as release-blocker to decide whether to keep or revert the change.
@ianlancetaylor I see where you're coming from but still have some concerns.
IIRC The behavior before CL 178919 was that sometimes the parent FD was used and sometimes the child FD was used. So even if you go with the model of Ctty specifying the child FD, I think there would still need to be a different fix to ensure that it's only the child FD that is ever used.
Given the above, I wonder if changing this behavior with that different fix would end up just breaking a different set of existing programs. It seems entirely possible to me that programs out there are using this interface but their parent process and child process always have FDs open in the right pattern such that their child ends up using a parent FD as a Ctty.
It seems to me that before CL 178919 we always used the child FD. In what circumstances would you say that we used the parent FD?
In my original post, the first test case, runChildWithCtty(5, 6, 7)
, results in the parent FD 7 being used as the Ctty of the child. FD 7 isn't set in the child process.
The second test case, runChildWithCtty(10, 11, 11)
has the behavior you're describing where the child FD is used. I think it's the inconsistency in behavior that's the real bug.
OK, I'm not sure what happens when the controlling terminal is closed. Is it meaningful to set the ctty of a process and then close that descriptor?
Yeah I believe there's no requirement at least on Linux (not sure about Posix in general) that the child actually keep the FD open.
The original context in which I encountered this bug was a parent process on Linux that managed a bunch of children, each one of which was in its own session with its own Ctty. The parent process kept the Ctty FD open after the child spawned until it wanted the child process's session to end, at which time the parent closed the Ctty (which results in SIGHUP being sent to every process in the child's session). This worked consistently in the "use parent FD as Ctty" case. It was a very rare occurrence that the code would randomly hit the "use child FD as Ctty" case and then things would break.
I can see that others were probably in the reverse situation and relied on the "use child FD as Ctty" case, but it seems like this may be a choice between who you want to break. I'm obviously a bit biased, but the "use parent FD as Ctty" behavior seems more straightforward. You don't need to coordinate between the Ctty and ExtraFiles fields in the parent and the child doesn't need to have an extra FD sitting around it may not have any use for.
Thanks, that seems persuasive.
This change has broken https://github.com/google/goexpect/, and I'm not certain that package was doing anything wrong.
The goexpect package sets up a SysProcAddr here:
cmd := exec.Command(command[0], command[1:]...)
// This ties the commands Stdin,Stdout & Stderr to the virtual terminal we created
cmd.Stdin, cmd.Stdout, cmd.Stderr = pty.Slave, pty.Slave, pty.Slave
// New process needs to be the process leader and control of a tty
cmd.SysProcAttr = &syscall.SysProcAttr{
Setsid: true,
Setctty: true}
SysProcAttr.Ctty
is 0
. Prior to this change, that sets the controlling terminal to cmd.Stdin
(since Ctty
refers to the FD in the child). After this change, it sets the controlling terminal to os.Stdin
(Ctty
refers to the FD in the parent).
Perhaps it would have been best if SysProcAttr.Ctty
had always referred to a FD in the parent, but unless I'm missing something changing it breaks working code that wasn't doing anything wrong.
@sipsma
IIRC The behavior before CL 178919 was that sometimes the parent FD was used and sometimes the child FD was used.
I don't think this is correct; from my read the controlling TTY was always set in the child immediately before calling exec
.
@neild I don't think there's any debate that the Ctty was always consistently set in the child. The issue was whether the FD being used to set the Ctty was a reference to an FD opened in the parent or a reference to an FD opened in the child.
The behavior before the fix was inconsistent; if you specified Ctty: 4
(for example), sometimes you would end up referring to what the parent has open at FD 4, sometimes you would end up referring to what the child has opened as FD 4.
Given this inconsistency needs to be fixed and will end up breaking behavior no matter what, my preference is the behavior that results in the clearest+simplest interface and the one with the least extraneous requirements, but that decision is up to the Go maintainers obviously. I'll just also mention that the code you posted would be updated to
cmd.SysProcAttr = &syscall.SysProcAttr{
Setsid: true,
Setctty: true,
Ctty: pty.Slave}
so while I agree the fix breaks your existing code, the updates required are at least pretty minimal and result in being more explicit (if that's any consolation).
This does bring up the point that it would probably be highly beneficial for the docs to make the behavior very explicit. Right now they are pretty ambiguous as to whether the Ctty FD is a reference to the parent or child, which is part of the source of confusion here: https://golang.org/pkg/syscall/#SysProcAttr
The behavior before the fix was inconsistent; if you specified Ctty: 4 (for example), sometimes you would end up referring to what the parent has open at FD 4, sometimes you would end up referring to what the child has opened as FD 4.
If you specified Ctty: 4
, how would you get anything other than the child's FD 4? The controlling terminal is set in the child immediately before calling exec
. It is a reference to a FD in the child.
Perhaps your point is that you don't have control over what FD 4 in the child is. But you do have control over FDs 0, 1, and 2, which is precisely the case which is broken here.
Perhaps your point is that you don't have control over what FD 4 in the child is.
Yes, in fact if you are a user just going by the docs, FD 4 is not even supposed to be open in the child (once the final child process is exec'd it will be closed via CLOEXEC
). That's why it seemed entirely reasonable to assume that the FD must refer the FD as set in the parent. It wouldn't make sense to rely on a 100% internal implementation detail such as the fact that an FD from the parent still happens to be opened in the child process at a given time.
Again, I agree your code is broken by this change as is, but switching the behavior the other way (make Ctty
always refer to child FDs`) is still a breaking change, just for a different set of cases.
Also, your suggested fix is not universally accurate (although it would work here).
Consider existing code:
cmd.Stdin, cmd.Stdout, cmd.Stderr = f0, f1, f2
cmd.SysProcAttr = &syscall.SysProcAttr{
Setsid: true,
Setctty: true,
Ctty: 0,
}
In Go 1.12 and earlier, this will always and consistently set the child's ctty to f0
. (First f0
is shuffled to FD 0, second we set the ctty.)
After the change to redefine Ctty
, this needs to be written as:
cmd.Stdin, cmd.Stdout, cmd.Stderr = f0, f1, f2
cmd.SysProcAttr = &syscall.SysProcAttr{
Setsid: true,
Setctty: true,
Ctty: f0.Fd(),
}
This will also work in Go 1.12 and earlier, unless f0
's FD in the parent is 1 or 2. In Go 1.12 we will:
f0
to FD 0, setting the original FD of f0
to close-on-exec.f1
and f2
to FDs 1 and 2. If f0
's original FD is 1 or 2, it will be clobbered.f0
's original FD is.This is perhaps an obscure case, but it indicates that there is no general way to write code which is safe with both the old and new APIs. (Also, it's the exact corner case which motivates this change.)
Again, I agree your code is broken by this change as is, but switching the behavior the other way (make Ctty always refer to child FDs`) is still a breaking change, just for a different set of cases
Prior to this change, it is possible to write code which safely sets the controlling terminal in the child.
After this change, it is very difficult to write such code which is correct with both older and newer Go versions.
After this change, it is very difficult to write such code which is correct with both older and newer Go versions.
Agree, you would need an implementation for go1.12 and older and a separate one for go1.13, which can be done with build tags AFAIK but is nonetheless unfortunate as it's code that's going to need to sit around until you are willing to tell your users that go 1.13+ is a strict requirement.
However, for the other set of users who would be broken by a switch to always using the child FD, they now would need to permanently have code that sets the Ctty FD in ExtraFiles and update their child process to close the file they don't actually need (or accept that another file is going to be opened in the child and possibly be passed down to descendants).
Either way, some set of users of the library are going to end having to include more extraneous code in their codebase because of the previous inconsistent behavior. It's just going to come down to which the maintainers want to impose I suppose.
for users broken by the switch to always using the child FD,
This is not a switch. This is the previous behavior. SysProcAttr.Ctty
has always referred to a file descriptor in the child. CL/178919 redefines it, breaking previously correct programs. This change should be reverted.
SysProcAttr.Ctty has always referred to a file descriptor in the child.
But it was inconsistent whether it referred to a file that was actually specified by the user to be opened in the child or it was another FD that temporarily existed due to an internal implementation detail, which is what needed to be fixed.
I've made my views clear and can't personally make a judgement call on which users the maintainers should break. I would just would say that if CL/178919 is reverted, then there still needs to be another fix to make Ctty consistently refer to a FD that is specified in ExtraFiles (which like I said before will just end up breaking another set of existing users and impose extra requirements on them).
I hadn't considered the necessity of calling the Fd
method, which has side effects. I think it may be better to revert CL 178919, document Ctty
as being the child's descriptor number, and accept that there is no way to start a child with a closed ctty descriptor. Perhaps for 1.14 we can return an error if Ctty
is set to a descriptor number that does not match Files
.
@ianlancetaylor I can understand (though respectfully disagree with) reverting CL 178919 and replacing it with a change that returns an error in the case where ExtraFiles doesn't have the Ctty being specified.
However, only reverting it and not fixing the inconsistent behavior (one way or another) concerns me as a user of the library. For me, this appeared as a rare bug that just caused an error, but there are other scenarios in which this inconsistent behavior is worse.
Say a user of the library meant to set a pty file as FD 3 via ExtraFiles and set Ctty: 3
, but they made a mistake and forgot to append it or accidentally provided nil to ExtraFiles. Without any fix, the child will now subtly end up using the parent's FD 3. If that FD is a tty for something else, what should have just been an error instead succeeds and results in a child process running with a random other controlling TTY that just happened to also be open at the time in the parent. The implications here run the gambit from just creating an extremely hard to diagnose bug to potential security concerns. This really seems like something that should be addressed in the code, not just docs, if at all possible before 1.14.
I think this change has proven to be fragile, breaking various existing packages, and I think it's too late in the 1.13 release cycle to mess with it. That is why I suggest fixing it in 1.14. While I agree that the current behavior has its problems, at least 1.13 won't be any worse than any previous release.
Okay, that's your call, then I'd just ask that it's clearly documented that it's Undefined Behavior if you set Ctty to an FD that was not set in Stdin/Stdout/Stderr or ExtraData (might be an error, might succeed with a random other FD from the parent).
Change https://golang.org/cl/183939 mentions this issue: Revert "syscall: use Ctty before fd shuffle"
For 1.14, I'd suggest:
Ctty
as referring to an FD in the child.os.StartProcess
when ProcAttr.Sys.Ctty >= len(ProcAttr.Files)
. This may break some existing programs, but only ones relying on an undocumented internal implementation detail; the value in catching subtle bugs is probably worth it.syscall.SysProcAttr
that allows specifying the FD in the parent; perhaps a bool indicating that Ctty
is an FD in the parent. Alternatively, say that we don't support that and you need to pass the ctty FD to the child.We should make doc changes and add a test for 1.13. Functionality changes like returning an error should wait for 1.14.
I assume eventually we want to fix parent-FD assumptions in:
golang's src/os/signal/signal_cgo_test.go:
cmd.SysProcAttr = &syscall.SysProcAttr{
Setsid: true,
Setctty: true,
- Ctty: int(slave.Fd()),
}
https://github.com/kr/pty/blob/b6e1bdd4a4f88614e0c6e5e8089c7abed98aae17/run.go#L50
I just noticed that if the SysProcAttr.Foreground
field is set, then all three implementations look at .Ctty
before adjusting the file descriptors. And, as noted above, if SysProcAttr.Setctty
is set then they look at .Ctty
after adjusting the file descriptors. Argh.
Change https://golang.org/cl/185242 mentions this issue: syscall: document that Ctty is a child descriptor
I just noticed that if the SysProcAttr.Foreground field is set, then all three implementations look at .Ctty before adjusting the file descriptors. And, as noted above, if SysProcAttr.Setctty is set then they look at .Ctty after adjusting the file descriptors. Argh.
So in the case that Foreground
is set to true, Ctty
is always a reference to the file as set in the Parent's FD table? Yeah, that definitely worsens the inconsistency quite a bit (and makes it much harder to accurately document what Ctty
actually means).
Looking around, it looks like a golang unit test currently relies on that behavior. The u-root
code @gthelen linked to before also appears to have had this assumption for a while.
Given the whole previous discussion plus this new finding, is it in the cards to just leave all the current behavior as is for the sake of backwards compatibility but in Go 1.14+ deprecate the fields related to Ctty
(or perhaps SysProcAttr
as a whole) and introduce a new interface that has more consistent and flexible behavior?
Yeah, I don't think there is anything we can do in the 1.13 timeframe here. Even the doc change is wrong.
Well, we didn't get this for 1.14 either. Sorry.
This issue is currently labeled as early-in-cycle for Go 1.15. That time is now, so friendly ping. If it no longer needs to be done early in cycle, that label can be removed.
Change https://golang.org/cl/229768 mentions this issue: syscall: document exact meaning of Ctty field
@ianlancetaylor Haven't looked at this in a while but I saw the change you posted. Can't comment on the CR directly, so commenting here:
Foreground
and Setctty
from being set to true
? If not, I think the behavior when both are true
is even worse than what the doc update says. On this line Ctty
would refer to an fd before the shuffle, but later here it would refer to an fd after the shuffle, all within a single call...Ctty **must** be a descriptor number in the child process: an index into ProcAttr.Files
is misleading because there are still situations where Ctty
isn't an index into ProcAttr.Files
but the call succeeds nonetheless.
Setctty
to potentially succeed. I feel the undefined behavior of this situation is important enough to mention if it isn't going to be fixed.The change I sent out is intentionally only a doc change. Any code changes will follow. I'm making the doc change separately since we may have to revert any code changes.
There is currently no check that prevents setting both Foreground
and Setctty
, but in general it doesn't make sense to set them both. I'm not worried about existing programs using that case.
I don't think using "must" in the doc comment is misleading. This kind of doc comment describes the requirements that callers must follow in order to get the results they expect. Callers that do not follow those requirements get unpredictable results. It's not necessary for these sorts of docs to describe the nature of the unpredictability.
In any case I do plan to make a code change to verify the Ctty
field in the Setctty
case.
Callers that do not follow those requirements get unpredictable results. It's not necessary for these sorts of docs to describe the nature of the unpredictability.
It's pretty common in my experience for a doc to say "otherwise, an error is returned" or "otherwise, behavior is undefined" in order to disambiguate. The cost of doing so is basically 0 (as far as I can tell anyways). I'm not suggesting documenting the details of what happens, just that it's undefined behavior as opposed to an error.
in general it doesn't make sense to set them both.
I agree that based on the posix definitions Setctty
should imply Foreground
, but from an outside perspective that would make me assume that setting both to true should be harmless. If you, for example, had some base default SysProcAttr
where Foreground
is filled out but then in a conditional branch also fill in Setctty
/Ctty
, it would be very surprising that you also need to explicitly set Foreground
to false. This, again, is not something I'm suggesting code changes for or tons of docs on, just explicitly saying behavior is undefined if both are set would be helpful for anyone trying to use this interface safely.
Either way, I appreciate you following up on this issue. I'm just offering these suggestions in order to give the perspective of an external user trying to make sense of all this, in case that's valuable.
Thanks, but I'm comfortable just saying "must" and not specifying what happens if the constraint is violated.
Change https://golang.org/cl/231638 mentions this issue: syscall: if Setctty, require that Ctty be a child descriptor
It seems that this change breaks the following patch to github.com/creack/pty:
https://github.com/creack/pty/pull/75
Which is necessary to support interactive programs which open /dev/tty
for user-interaction and consume actionable data from stdin, such as less, fzf, vipe, and so on. This breaks the user interaction model of aerc.
@ddevault If tty
is going to be open in the child process, then it must have a file descriptor in the child process. When using os/exec.Cmd
as the creack/pty code does, tty
must be in Stdin
or Stdout
or Stderr
or ExtraFiles
. If it isn't in any of those, then it will be closed when the child process runs. If it is in one of those, that gives you the file descriptor number that you should use in Ctty
.
Right now the creack code is setting Ctty
to the parent's file descriptor number. That is working (assuming that it is working) by accident. The correct fix is to ensure that tty
is passed to the child somewhere, and set Ctty
to the appropriate descriptor number. If it is possible for all of Stdin
, Stdout
, Stderr
to be set, the simplest approach would be to always add tty
to ExtraFiles
, and set Ctty
to 3
. That will work for old versions of Go and also for the upcoming 1.15 release.
It looks like creack was able to put together this fix: https://github.com/creack/pty/pull/97
OK, looks like that version works because it leaves the Ctty
field as zero, meaning that the tty is taking from standard input.
What did you do?
In some corner cases,
cmd/exec
will start a child process with a controlling tty FD from theExtraFiles
field of Cmd instead of the FD specified inSysProcAttr.Ctty.
This occurs when the FD number of
SysProcAttr.Ctty
in the parent is 0, 1, 2 or 3+i, where i is an index that has been populated in theExtraFiles
field of the Cmd struct.This happens because the dup2 loop of forkAndExecInChild1 runs before the ioctl calls to set the Ctty. When making the ioctl calls, it's still is using the
Ctty
FD value passed in from the parent, so if the child process happens to dup2 over that FD value, theCtty
passed from the parent is closed and the child'sExtraFile
FD is used instead of theCtty
FD configured by the parent. This usually results in aENOTTY
error (unless the ExtraFile happens to also be a TTY).The following reproducing code (compiled with CGo enabled, on Linux w/ glibc) first creates a child process where the bug doesn't occur and then one where the bug does occur. When the bug occurs,
forkAndExecInChild1
will incorrectly use the parent's FD 10 (/dev/null
) when making the ioctl to setup the ctty even thoughSysProcAttr
configured it to use the parent's FD 11.This might be in something of a grey area as to whether it's a bug or expected behavior, but I'd consider this a bug because:
SysProcAttr.Ctty
field, I am providing an FD of the parent, so it's very surprising that in some corner cases it ends up instead being a reference to an FD in the child processExtraFiles
) that will be set in the child. It seems like the whole point of the interface given bycmd/exec
is to make the FD numbers of the parent and child independent of one another and only related by a mapping, saving users from having to think about corner cases of FD inheritance.dup
FDs in a loop until they no longer conflict between the Ctty field and ExtraFiles or make the FD numbers of our child process ExtraFiles dynamically configurable. Neither is ideal.What did you expect to see?
I expected both cases in the reproducing code to work, making the ioctl call to set the ctty using the
SysProcAttr
field as configured in the parent process.Running
strace -f -b execve -e 'trace=desc' ./main
, this is the (filtered) output in the working case:What did you see instead?
In the case where the bug occurs, this is the filtered strace output:
Even though the child process does call the ioctl on FD 11 at the end, 11 was previously overwritten by a dup2 call (from 13, which itself was dup2'd from 10, which is a FD configured in
ExtraFiles
, notCtty
).So the end effect is that the parent's FD 10 from
ExtraFiles
was used as theCtty
instead of the parent's FD 11.Does this issue reproduce with the latest release (go1.11.4)?
Yes
System details