[`wasi:sockets/ip-name-lookup#resolve_addresses`] way to track name to `IpAddress` resolution

rylev commented 11 months ago

My use case is essentially to allow any socket operation as long as the IP addressed used for that operation was resolved in the last call to wasi:sockets/ip-name-lookup#resolve_addresses.

Given the changes in https://github.com/bytecodealliance/wasmtime/pull/7662 this would stand to reason that somehow I need the ability to have a list of resolved names to IP addresses in the closure passed to socket_addr_check.

This could be accomplished in several ways:

Wasmtime passes such a list directly as an argument to the SocketAddrCheck closure.
Wasmtime gives a way for the embedder to query what the list is at any time.
Wasmtime allows the embedder to hook into DNS resolution and be called back with any (name, resolved IP address) pairs
Other ways?

I think I might lean towards option 3. Perhaps this callback could be in the form Fn(&str, IpAddress) -> Result<(), ErrorCode> which would be called anytime resolve_next_address is about to return a Ok(Some(addr)) to the guest. This would allow the embedder to do the following things:

Observe DNS resolution as it happens (needed to unlock my use case)
Change resolution so that a name resolves to a different set of IP addresses
Return an ErrorCode error from resolve_next_address

If we can agree on an approach, I'm happy to do the implementation.

alexcrichton commented 11 months ago

I think this issue definitely makes sense, and this got me thinking a bit too. Reading over this issue and focusing solely on its text I'd also lean towards option 3 you've proposed but that quickly runs into another issue I think. The base problem is that Rust doesn't do well when building up a callback-style API with multiple callbacks. For example the "this got resolved" callback is distinct from the "is this socket address valid" callback. These two callbacks don't share state meaning that you'd have to use Rc/Arc. With mutability you're forced to also use RefCell/Mutex. Given that these are used in a Send/Sync context, that's requiring Arc<Mutex<...>> which isn't great because there's no actual multithreading going on here.

Given that problem it's prompted me to step back a bit. We solved this in Wasmtime for host functions by passing &mut T (effectively) to all host functions. That's not possible here though because a T in Store<T> isn't availble from WasiCtx. This means that I don't think there's an easy solution to this problem.

So that leads me to a few possible alternatives:

Implement (3) as is, use Arc<Mutex<_>>, be a little sad on the inside but things are working.
Add a type parameter to WasiCtx, such as WasiCtx<U>. Pass &mut U to the callbacks configured on WasiCtx to have "shared state".
Add a trait object instead to WasiCtx, something like Box<dyn WasiCustomBehavior>. Move these callbacks to trait methods on WasiCustomBehavior and then the shared state is &mut self.

My personal preference given those options would be to remove socket_addr_check and instead have something like:

impl WasiCtxBuilder {
    pub fn socket_config(&mut self, config: impl WasiSocketConfig) -> &mut Self;
}

pub trait WasiSocketConfig: Send + Sync + 'static {
    fn socket_addr_check(&mut self, addr: &SocketAddr, use_: SocketAddrUse) -> bool { true }
    fn resolved_addr(&mut self, name: &str, addr: &IpAddr) -> bool { true } // or bikeshed this signature
}

This trajectory of design however feels like it gets a bit silly taken to the limit though. For example we've already got traits representing address lookup, they're just generated by wit-bindgen. Technically there's nothing stopping you from having a non-wasmtime-wasi-based implementation and using add_to_linker yourself. Adding a second WasiSocketConfig is a whole extra trait on top of this preexisting trait. At that point why limit it to sockets? Why not put all of WasiCtxBuilder behind a trait effectively? For example WasiCtxBuilder could be a builder-style version of creating a context that implements a new Wasi trait, and all add_to_linker is defined in terms of T: Wasi. That way embeddings could have a custom T: Wasi and run with that.

Ok that's a lot of thoughts, not all of which you necessarily asked for on this issue. I think though what I might recommend for the near-term is something like:

No changes to wasmtime-wasi
Your embedding would add_to_linker for the ip-name-lookup interfaces and hook into the results there
State would be shared with socket_addr_check with an Arc<Mutex<_>>

That should all work today as-is, albeit not in a pretty fashion.

Looking more towards the future I think I'd personally prefer to consider a trait Wasi { ... } style approach. All the existing implementations would be defined for T: Wasi rather than just WasiCtx and that way we could decorate as many callbacks and hooks as we want on trait Wasi and have them "easily configurable" for embeddings.

badeend commented 11 months ago

Hi @rylev . I raised a similar issue 4 days ago. Specifically the "Domain-based policy strategy" section from that issue.

Just wanted to make sure we're not about to do duplicate work.

badeend commented 11 months ago

@alexcrichton Option 4(?):

Create a specialized Wasi***View per package/interface and place the extension points on that. Similar to WasiHttpView. No Boxes or Mutexes required.

Example for TCP:

pub trait WasiTcpView: WasiView {
    fn check_allowed_tcp(&mut self) -> std::io::Result<()> {
        Err(Errno::ACCESS.into())
    }
    fn check_tcp_addr(&mut self, addr: &SocketAddr, reason: SocketAddrUse) -> std::io::Result<()> {
        Err(Errno::ACCESS.into())
    }
}

- impl<T: WasiView> crate::preview2::host::tcp::tcp::HostTcpSocket for T {
+ impl<T: WasiTcpView> crate::preview2::host::tcp::tcp::HostTcpSocket for T {

alexcrichton commented 11 months ago

That's probably the best out of all ideas I think actually!

bytecodealliance / wasmtime

[`wasi:sockets/ip-name-lookup#resolve_addresses`] way to track name to `IpAddress` resolution #7694