Open Rinzwind opened 1 year ago
In http://gtoolkit.com we've added Socket>>setCloseOnExec:
which does just what you suggest above.
Perhaps this should be added to Pharo :-).
gtoolkit.comGlamorous Toolkit is the moldable development environment.
Ah yes, would be good to add that to Pharo: I see the implementation is in ‘GToolkit-Utility-File’, as an extension on Socket which delegates to a corresponding method on SqSocket.
As for the alternative I had in mind of explicitly mapping the file descriptors, I’ve noted there’s a method #closeAllButStandardFileStreams
in OSSubprocess though it’s not actually implemented. OSSubprocess makes use of posix_spawn
; the functions for manipulating the file_actions
unfortunately don’t seem to provide a way to close a range of file descriptors. The rationale for close
in POSIX.1-2017 also says “The standard developers rejected a proposal to add closefrom()
to the standard […] it is not possible to standardize an interface that closes arbitrary file descriptors above a certain value […].”
I would strongly support integrating this in base Pharo, and in fact making it the default for new FDs (file and socket both), requiring a setCloseOnExec: false
[^2] to get the old behavior. I suggest this because there is a potential race condition when one Process
opens a file/socket, then sets FD_CLOEXEC
, while another forks a child process. Presumably if the one doing the fork wants the FD inherited, there would be no race condition with them clearing FD_CLOEXEC
before forking[^1]. So rather than add an extra argument to a huge number of methods—consider all the implementors of [binary]writeStream[Encoded:][do:]
!—it seems better to apply a safe default, then have a single point of entry to change it back.
I see in the code that GT is using a custom primitive for fcntl()
, noting that FFI doesn't support varargs functions on ARM64 (in fact, if my research is correct, it's specifically ARM64/macOS that is different). For anyone wishing to use this code in their own application without loading, presumably, a binary plugin, it appears that on x86_64, a varargs function is called just like a fixed-args function declared with that exact pattern of args—that is, calling fcntl(my_fd,F_SETFD,existing_flags|FD_CLOEXEC)
works exactly as if fcntl()
was declared as int fcntl(int fd, int cmd, int arg)
, even though in fact the declaration is int fcntl(int fd, int cmd, ... /* arg */ )
. So we can declare the FFI method this way, bypassing the lack of explicit varargs support:
fcntl: fileDescriptor _: cmd _: arg
"Perform the specified fcntl command."
^ self ffiCall: #( int fcntl #( int fileDescriptor , int cmd , int arg ) )
And the code will then work on most platforms without the plugin. This is taking advantage of undefined behavior, so the usual caveats about nasal demons apply, but it's a workable stopgap solution.
Ultimately, given the architecture of the Pharo file/socket plugins, I imagine the solution would be a primitive as part of those plugins that performs an fcntl()
on the underlying FD without requiring the image code to traverse the private structs involved in a file handle—which would be good, as that is an obvious hack in its own right.
I could possibly even write those primitives—I've never written C, but I can read it well enough, and most of it would be boilerplate that can be copied from existing primitives—but I'd definitely want someone to look them over, and I can't actually compile the VM myself to test them (long story).
[^1]: Potentially there could be a race if multiple Processes
want to fork a child OS-process, each of which inherits only some particular FD, but that would be the case in C code as well as the flag is effectively "global", it lives on the FD rather than being part of the execve()
call, and the solution is the same—the "clear FD_CLOEXEC
, fork child, re-set FD_CLOEXEC
" sequence needs to be a mutual critical section, guarded by a Semaphore or similar.
[^2]: Though I might name it just #closeOnExec[:]
or #isCloseOnExec[:]
rather than a get
/set
pair.
Thanks Daniel. I tagged it so that Pablo can have a look. Now if people would play the game and contribute back to Pharo by sending PRs this would help us. But we will do it by ourselves. I imagine that we will discuss this is in January after holidays.
Thanks @Rinzwind for the extension link :).
Like I said, I'd be open to collaborating on this—certainly I could fill in the image side of things, referring to a primitive that doesn't yet exist but has an obvious implementation, if someone with VM experience can do that half. I could probably struggle through the VM side, too (I see I can download artifacts from a CI build, so the fact that I can't build my own VM wouldn't have to stop me entirely...), but if someone with more VM experience has time, I'd rather work on other things in the image...
I know. My remark was more general. Our VM guys will have a look.
My recommendation would be to stay close to POSIX on these things. It is normal that all fds are available in a child process. This is needed because a lot of things rely on processes share stdin/stdout/stderr and for other things. Putting a half baked idea in pharo might not be a good idea.
@daniels220 If it is about the race condition with FD_CLOEXEC there exist O_CLOEXEC and SOCK_CLOEXEC on creation time to exactly remove that problem. So if there is an FFI problem with setting these flags we should fix this
So I'm against changing a default behaviour to somehting new that just opens a lot of other problems. Parent processes can explicitly set FD_CLOEXEC before forking or child processes can clear them after the exec(). There is no such thing as a better default. Especially not if we depart from POSIX
Thanks Norbert!
Norbert, you have a point...I know I can perhaps be overeager in my opinions when I don't have experience of the full context (in this case, I haven't written C or similar, or dealt directly with POSIX APIs). Certainly it's fair to do some more research...but when I did a quick Google of "is it best practice to always use O_CLOEXEC", the top few results were mostly "yes," either "...unless you specifically know you need to share that FD," or "and if you do need to share one, clear FD_CLOEXEC immediately before fork and re-set it afterward, to avoid accidentally sharing it with a different child process". So this seems like a case where POSIX is held back by the need for backwards-compatibility, creating a footgun for new C programmers for the rest of time, while for Pharo, I think the number of programs that actually rely on sharing FDs (other than stdio) with child processes is...probably either zero or one, honestly. I would note also that you can't actually retrieve the FD of an open file without private-struct hijinks, so how would you even tell the child process what to use? I imagine it can examine the FD table itself, but that seems weird and error-prone.
Another way of looking at it: Python, Ruby, Java, and even Rust all use O_CLOEXEC on all files, with no way I saw at a glance to not do this. They expose fcntl()
, allowing you to clear the flag later, but don't bother wrapping it in any high-level API. So that's pretty strong evidence too that the programming community at large considers it a "should have been the default all along" sort of thing.
One definite point of clarification—I forgot to mention stdio—those I would agree should remain as they are even if the default for files and sockets were to change. The existing OSSubprocess library already has the ability to control if/how they are inherited in a thread-safe way, in any case.
Regarding optional use of O_CLOEXEC
and SOCK_CLOEXEC
—the problem isn't just adding support to the FilePlugin
and SocketPlugin
primitives, it's with how much surface area in the in-image API would be affected. If I execute 'foo.txt' asFileReference writeStream
, at one point the stack looks like this:
FileHandle(FileSystemHandle)>>setReference:writable:
FileHandle class(FileSystemHandle class)>>on:writable:
FileHandle class(FileSystemHandle class)>>open:writable:
FileSystem>>open:writable:
FileSystem>>binaryWriteStreamOn:
FileReference>>binaryWriteStream
FileReference(AbstractFileReference)>>writeStreamEncoded:
FileReference>>writeStream
UndefinedObject>>DoIt
The subsequent #open
in FileSystem>>open:writable:
is a no-op with a TODO flag, and the actual opening happens with the stack looking like:
File class>>open:writable: (primitive)
[] in File>>basicOpenForWrite:
File class>>retryWithGC:until:forFileNamed:
File>>basicOpenForWrite:
File>>openForWrite:
File>>openForWrite
File>>writeStream
FileHandle>>binaryWriteStream
FileSystem>>binaryWriteStreamOn:
...as above...
So if we were to...okay, let's keep things simple and say we just add an argument specifically for this and make the primitive open:writable:closeOnExec:
. (This feels dirty to me, but there don't seem to be many other similar flags, so maybe it's not so unreasonable?) Now we have 13 other methods that need a variant with a parameter in order to pass the argument down the chain. And there are at least 2 more methods on the "write" side, and something like 8 more on the read side, plus oddballs like appendStream
, that aren't part of this call chain but would need the same treatment. And what does the high-level API even look like? #writeStreamWithCloseOnExec
? Not exactly pretty...
Perhaps a more builder-styled API would be better—add a property to FileReference
so that it looks like ('foo.txt' asFileReference) useCloseOnExec; writeStream
. Then we could continue the pattern and add a similar property to FileReference
and File
. This is probably less added code overall, but it's semantically questionable—is it really a property of the file reference that "oh, hey, if we open this file we should do so with O_CLOEXEC"? What happens when you combine relative paths—does the new FileReference inherit the flag from the one it was constructed from? Etc. etc. So it's a tradeoff and not clearly better.
And none of this does anything for applications that make use of library code that opens its own files or sockets, without O_CLOEXEC
, and elsewhere in the application happens to fork child processes. So add on top of that needing to thread the argument through and/or add option properties to all the various XML, JSON, CSV, etc. parsing libraries that have their own wrappers to open a file and transparently translate the markup language.
Thanks for the discussion! This is important to have multiple views. I like the fact that other languages are doing this.
Clearly nothing is going to happen with this for Pharo 12...could someone move it to the Pharo 13 milestone and maybe we can revisit in a couple months?
Done
Child processes created through
#system:
or#resultOfCommand:
keep open copies of the file descriptors of the Pharo VM process. In the following example, this poses a problem: the second send of#bindTo:port:
signals a SocketError (“Address already in use”) because the ‘sleep’ process still has the socket open. The third send does not signal an error because by then the ‘sleep’ process has exited.The ‘sleep’ process’s open file descriptors can be listed through ‘lsof’ as done in the following example. The file descriptors include the one for the socket, as well as ones for the Pharo sources and changes files (from
#sourcesFileStream
and#changesFileStream
):There is a flag (
FD_CLOEXEC
) that can be set on a file descriptor (throughfcntl
) such that it will not remain open in the ‘sleep’ process. In Zinc issue #110, I gave a snippet that sets the flag on the Zinc server socket. Perhaps that should be generalized to a method for setting the flag on any file descriptor. Alternatively, there should be a variant of#system:
that takes an explicit file descriptor mapping for the child process, closing any file descriptor in the child process that is not mapped, something like the following: