WebAssembly / wasi-sockets

WASI API proposal for managing sockets

What's in a socket? #44

Open SoniEx2 opened 1 year ago

SoniEx2 commented 1 year ago

Apologies for the, uh, vague issue, but we're not sure of a better way to do this. We want a fairly open-ended discussion about what a socket is. A starting point might be to look at how different existing systems handle sockets (say, BSD vs. microkernels) and how that affects things like socket options. Or indeed at existing sandboxing attempts.

Personally we don't have any resources on hand about this stuff but opening an issue is a good way to get poked about looking into it. :v

badeend commented 1 year ago

I don't worry too much about the (historic & broader) definition of a "socket" and what it is "supposed" to do. Rather, I focus on this: how can we provide a WASI-native interface that gives applications internet connectivity in such a way that the majority of real-world, existing BSD sockets applications need little to no modification?

SoniEx2 commented 1 year ago

alright. do you think a compositional socket model would be appropriate for wasi? you ever think about how TCP and UDP are effectively the only transport protocols and get upset that something like SCTP could never be implemented?

personally we think our ideal socket approach would be having tcp/ipv4, tcp/ipv6, etc sockets fully separate. kinda like a microkernel, with each network layer being its own userspace daemon. or at least, being able to pretend it is.

that way, sctp would be simply another userspace daemon. e.g. the IPv6 layer would take IPv6 packets, find the type (tcp vs udp vs sctp, effectively a service identifier), and forward that to the appropriate userspace daemon, and the sctp/ipv6 daemon would take that packet, look at the port number (which is another service identifier), and forward that to the appropriate userspace app.

we think exposing something like this instead of the traditional BSD sockets would be pretty neat. but wasm being wasm, we can use function calls instead of unix sockets to communicate between the application and the userspace daemon. in fact there wouldn't be a userspace daemon in practice. since the tcp daemon would wrap the ipv6 stuff and you'd use it as a tcp/ipv6 daemon, wasi could just expose separate tcp/ipv6 and udp/ipv6 interfaces and maybe define a composition-based design pattern if someone ever wants to provide sctp to wasm. (in other words... wasi-sockets should not only provide tcp/ipv6 and udp/ipv6, but also a design pattern for future transport protocols and even raw sockets.)
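
A minimal sketch of the layered dispatch described above, with plain function calls standing in for the userspace daemons. The class and method names (`NetworkLayer`, `TransportLayer`, `register`, `bind`, `deliver`) are hypothetical and not part of any WASI proposal; the only real-world values are the IANA protocol numbers for TCP, UDP and SCTP. Packets are reduced to a protocol number, a port and a payload, so no header parsing is shown.

```python
from typing import Callable, Dict

# IANA-assigned IP protocol numbers (the "next header" value in IPv6).
TCP, UDP, SCTP = 6, 17, 132

PortHandler = Callable[[bytes], str]


class TransportLayer:
    """Hypothetical stand-in for one transport 'daemon': routes by port."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.ports: Dict[int, PortHandler] = {}

    def bind(self, port: int, handler: PortHandler) -> None:
        if port in self.ports:
            raise ValueError(f"{self.name} port {port} already bound")
        self.ports[port] = handler

    def deliver(self, port: int, payload: bytes) -> str:
        handler = self.ports.get(port)
        if handler is None:
            return f"{self.name}: no listener on port {port}"
        return handler(payload)


class NetworkLayer:
    """Hypothetical stand-in for the IPv6 layer: routes by protocol number."""

    def __init__(self) -> None:
        self.transports: Dict[int, TransportLayer] = {}

    def register(self, protocol: int, transport: TransportLayer) -> None:
        self.transports[protocol] = transport

    def deliver(self, protocol: int, port: int, payload: bytes) -> str:
        transport = self.transports.get(protocol)
        if transport is None:
            return f"ipv6: no transport for protocol {protocol}"
        return transport.deliver(port, payload)


ipv6 = NetworkLayer()
tcp = TransportLayer("tcp")
ipv6.register(TCP, tcp)
tcp.bind(80, lambda data: f"http app got {len(data)} bytes")

# Adding SCTP later touches neither the IPv6 layer nor the TCP layer.
sctp = TransportLayer("sctp")
ipv6.register(SCTP, sctp)
sctp.bind(5000, lambda data: f"sctp app got {len(data)} bytes")

print(ipv6.deliver(TCP, 80, b"GET /"))     # http app got 5 bytes
print(ipv6.deliver(SCTP, 5000, b"hello"))  # sctp app got 5 bytes
print(ipv6.deliver(UDP, 53, b"query"))     # ipv6: no transport for protocol 17
```

The point of the sketch is only the shape: each layer knows one kind of service identifier (protocol number, then port), so a new transport is one more `register` call rather than a change to an existing interface.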

badeend commented 1 year ago

This modularization already exists to some degree. This proposal includes UDP & TCP as fully independent interfaces. Additionally, there is nothing stopping a future proposal from adding a third separate interface for SCTP, and a fourth for RAW sockets, etc.

From the README:

This proposal is not POSIX compatible by itself. The BSD sockets interface is highly generic. The same functions have different semantics depending on which kind of socket they're called on. The man-pages are riddled with conditional documentation. If this had been translated 1:1 into a WASI API using Interface Types, this would have resulted in a proliferation of optional parameters and result types.

Instead, the sockets API has been split up into protocol-specific modules. All BSD socket functions have been pushed into these protocol-specific modules and tailored to their specific needs. Functions, parameters and flags that did not apply within a specific context have been dropped.
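
To illustrate the trade-off the README describes, here is a rough sketch contrasting a single generic socket type with protocol-specific ones. All names (`GenericSocket`, `TcpSocket`, `UdpSocket`, `send_to`) are made up for illustration and are not the actual wasi-sockets interfaces. The model is also simplified: it ignores connected UDP sockets, for example.

```python
from typing import Optional, Tuple

Address = Tuple[str, int]


class GenericSocket:
    """BSD-style: one type for every protocol, so validity is checked at call time."""

    def __init__(self, kind: str) -> None:
        self.kind = kind  # "stream" or "dgram"

    def listen(self, backlog: int) -> None:
        if self.kind != "stream":
            raise OSError("listen() is not supported on datagram sockets")

    def send(self, data: bytes, to: Optional[Address] = None) -> int:
        # The destination is optional because it only applies to some socket kinds.
        if self.kind == "dgram" and to is None:
            raise OSError("destination address required")
        return len(data)


class TcpSocket:
    """Protocol-specific: only operations that make sense for TCP exist."""

    def listen(self, backlog: int) -> None:
        self.backlog = backlog

    def send(self, data: bytes) -> int:
        return len(data)


class UdpSocket:
    """Protocol-specific: no listen(); the destination is a required parameter."""

    def send_to(self, data: bytes, to: Address) -> int:
        return len(data)


generic_udp = GenericSocket("dgram")
try:
    generic_udp.listen(16)
except OSError as err:
    print("generic:", err)  # generic: listen() is not supported on datagram sockets

print(hasattr(UdpSocket, "listen"))                  # False
print(UdpSocket().send_to(b"ping", ("::1", 9000)))  # 4
```

In the generic version the mistake only shows up as a runtime error and the optional `to` parameter needs conditional documentation; in the split version the invalid call simply does not exist.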


You mention tcp/ipv4 and tcp/ipv6 existing as separate modules. I did tinker with this idea too, but ultimately didn't think it was worth it;

SoniEx2 commented 1 year ago

ah yeah. we mean, in a real microkernel you'd have a single tcp daemon, so dualstack sockets would still work fine there. but yeah.

badeend commented 1 year ago

Is there anything else you wanted to discuss?

SoniEx2 commented 1 year ago

not particularly. we still think it's important to look back on existing implementations and understand their pain points and where they break down, but yeah we think that's about all there is to it really.

we don't know tho. maybe there's more stuff we haven't considered?