Jarred-Sumner opened 9 months ago
Would be nice if you could make true IPC available, so that we can share data with bun processes that aren't children. See also #11683
Also kinda unrelated to this issue, but node-ipc doesn't even work on bun - #12712
EDIT: I found an alternative to node-ipc called zeromq, but sadly it currently doesn't work on bun either, see ~#12711~ #12746
Is this feature becoming stable any time soon? I know bun offers mmap, but this dedicated IPC module feels better
This would honestly be huge for realtime processing applications, @Jarred-Sumner.
We spent a few days trying to find a memory leak in our code when, in the end, it was because we were sending thousands of buffers to a child process (our fault for not looking at the underlying IPC implementation or finding this issue first).
As a temp solution, we're using semaphores from the async-mutex package to avoid overloading the domain socket.
We're probably not going to work on this very soon, but I wanted to write some thoughts out while they're in my head
IPC is currently implemented using unix domain sockets, and mostly relying on structuredClone() to serialize & then later deserialize.
This works great in many scenarios, but it's certainly not optimized. Every message is cloned into a temporary buffer, and then that temporary buffer is immediately written to the socket (which is cloned). Reading the message also clones it again. This is too many copies.
I think it'd make sense to specialize on a couple kinds of messages:
- `proc.send(new ArrayBuffer(42))`
- `proc.send(new Buffer(42))` (supporting various typed array types + Buffer)
- `proc.send(JSON.stringify("abc"))`
- `proc.send(JSON.stringify("abc❤️"))`
There are a number of options we can do to make IPC fast.
Linux: memfd_create
memfd supports sealing, which lets us make read-only in-memory file descriptors. We can have a reader end and a writer end this way. eventfd can be used for signaling readiness across processes. memfd is good here because we can skip the read() and write() system calls + copying in/out of kernel space.

What we can do here, not huge edition:
1) Make the memfd expand up to a maximum of 2 MB or so
2) mmap() the memfd and copy strings and other data directly into it
3) Write to the eventfd to signal the other process to read it
4) In the other process, keep a MAP_SHARED mapping of the memfd
5) Clone the string, arraybuffer, etc. and call it a day
Before: cloning to the StructuredSerialize format, then cloning to the unix domain socket, then reading from the unix domain socket, then decoding from the unix domain socket, and cloning one last time again.
After: One clone to the memfd, and one clone in the other process to the WTF::String or JSC::ArrayBuffer.

What we can do here, huge edition:
1) Given a large ArrayBuffer, string, etc., create a fresh new memfd
2) Seal it
3) Send the memfd via sendmsg
4) Write to the eventfd to signal the other process to read it
5) [other process] Receive the memfd via recvmsg
6) mmap() the memfd into a MAP_PRIVATE copy
7) Use WTF::ExternalStringImpl, or a WTF::ArrayBuffer::* method, to unmap and close the file descriptor once the string or arraybuffer is finalized
Before: cloning to the StructuredSerialize format, then cloning to the unix domain socket, then reading from the unix domain socket, then decoding from the unix domain socket, and cloning one last time again.
After: One clone to the memfd
The tradeoff here is that the size of the content has to be large enough to justify the cost of the unique memory mapping per message as well as the cost of keeping a file descriptor open for potentially a long time. That's why you probably only want to do the 1 clone approach for very large messages.
Darwin: mach_vm_copy and mach ports

TODO: expand on this