rust-lang / futures-rs

Zero-cost asynchronous programming in Rust
https://rust-lang.github.io/futures-rs/
Apache License 2.0

DNS resolution #58

Closed sbstp closed 8 years ago

sbstp commented 8 years ago

This is a feature that's missing from mio and futures-rs. All the APIs take a SocketAddr, and do not offer a way of resolving host names. The lookup_host method from the standard library is only available on nightly, making it "technically" impossible to resolve host names on stable. I found a hack around this using TcpStream::connect followed by a call to peer_addr() but it's far from ideal and it's not asynchronous.

I think a really simple implementation could be offered using a thread pool and a call to getaddrinfo. Eventually there could be a better solution such as bindings to the getdns library, or a home made solution.

alexcrichton commented 8 years ago

Yeah I would love to have asynchronous DNS resolution, it's definitely a missing feature! Right now you can actually do this on stable Rust via to_socket_addrs and the SOCKSv5 proxy has an example of this. That example also does the work on a thread pool.
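For illustration, the "blocking lookup on another thread" idea could be sketched with only the standard library as below. The helper name `resolve_in_background` is hypothetical, and it spawns a plain thread per lookup rather than using a real thread pool; it is not the code from the SOCKSv5 example.

```rust
use std::io;
use std::net::{SocketAddr, ToSocketAddrs};
use std::sync::mpsc::{self, Receiver};
use std::thread;

/// Hypothetical helper (not part of futures-rs or mio): runs the blocking
/// `to_socket_addrs` lookup on its own thread and hands back a channel
/// that will receive the result.
fn resolve_in_background(host: String, port: u16) -> Receiver<io::Result<Vec<SocketAddr>>> {
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // `(&str, u16)` implements `ToSocketAddrs`; this call may block on DNS.
        let result = (host.as_str(), port)
            .to_socket_addrs()
            .map(|addrs| addrs.collect::<Vec<SocketAddr>>());
        // The receiver may already be gone; ignore the send error in that case.
        let _ = tx.send(result);
    });
    rx
}
```

The caller can keep doing other work and call `recv()` (or `try_recv()`) on the returned channel when it needs the addresses.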

In general the pieces should be there in the sense of:

In the future though I'd love to integrate with something like rotor-dns or c-ares, as they'd slot in very nicely!

sbstp commented 8 years ago

Gee, I feel stupid, I totally forgot about the ToSocketAddrs trait. I haven't been using it because it may return addresses that don't work, and you have to try many addresses and figure out which one works. It would be nice if that process could be implemented in the futures crate too.

alexcrichton commented 8 years ago

It's true yeah, although I'm not actually sure what the best practice is for a DNS query that returns multiple addresses. If that happens should we issue a bunch of parallel TCP connects and select the first one that returns? Or should we try each one in series with a particular timeout?

sbstp commented 8 years ago

Sequential could be slow (although it is the current behavior of std::net), and parallel could "spam" the server if all the addresses end up working. Perhaps the loop could have a setting describing how it should try to connect, and the user could pick the appropriate one for their use. Perhaps use sequential as a default, and if the users need performance, they can choose parallel.
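A rough sketch of what such a setting might look like, using only the standard library. `ConnectMode` and `connect_any` are made-up names, and the actual connect step is passed in as a closure so the strategy itself stays independent of any particular socket type. The parallel branch uses one thread per address, which is only a stand-in for real asynchronous connects.

```rust
use std::io;
use std::net::SocketAddr;
use std::sync::mpsc;
use std::sync::Arc;
use std::thread;

/// Hypothetical setting describing how to try multiple resolved addresses.
#[derive(Clone, Copy, Debug, PartialEq)]
enum ConnectMode {
    Sequential,
    Parallel,
}

/// Tries `connect` on each address and returns the first success,
/// or the last error seen if every attempt fails.
fn connect_any<T, F>(mode: ConnectMode, addrs: Vec<SocketAddr>, connect: F) -> io::Result<T>
where
    T: Send + 'static,
    F: Fn(SocketAddr) -> io::Result<T> + Send + Sync + 'static,
{
    let mut last_err = io::Error::new(io::ErrorKind::InvalidInput, "no addresses to try");
    match mode {
        ConnectMode::Sequential => {
            // One attempt at a time, in the order the resolver returned them.
            for addr in addrs {
                match connect(addr) {
                    Ok(conn) => return Ok(conn),
                    Err(e) => last_err = e,
                }
            }
        }
        ConnectMode::Parallel => {
            // One thread per address; the first successful result wins.
            let connect = Arc::new(connect);
            let (tx, rx) = mpsc::channel();
            for addr in addrs {
                let (tx, connect) = (tx.clone(), Arc::clone(&connect));
                thread::spawn(move || {
                    let _ = tx.send((*connect)(addr));
                });
            }
            drop(tx); // the loop below ends once every worker has reported
            for result in rx {
                match result {
                    Ok(conn) => return Ok(conn),
                    Err(e) => last_err = e,
                }
            }
        }
    }
    Err(last_err)
}
```

With real sockets the closure could be something like `move |addr| TcpStream::connect_timeout(&addr, timeout)`. Note that the parallel branch leaves the losing attempts running in the background, which is exactly the "spam" concern raised above.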

MarkusJais commented 8 years ago

I agree with sbstp. Best to have an option to pick the appropriate strategy. It depends on the individual requirements of an application, network settings, etc. If a call "spams" the server some network security tools might block access completely.

carllerche commented 8 years ago

Whatever the DNS solution turns out to be, I would like to try to coordinate w/ Tokio as there is a need there as well.

alexcrichton commented 8 years ago

Indeed! To me this is sort of a "quest issue" right now where it's less about how to expose it and more about how to implement it. First we need to find an appropriate implementation!

sbstp commented 8 years ago

Here's a draft of what the API could look like.

I would create a futures-dns crate that offers a variety of resolvers. For now we can provide a thread-pool based resolver, and eventually use rotor-dns or c-ares to provide an asynchronous resolver, as Alex mentioned.

Obviously the various resolvers would have to conform to an interface, which might look like this:

trait Resolver {
    fn resolve(&self, hostname: &str) -> BoxFuture<Vec<IpAddr>, io::Error>;
}

A trait similar to ToSocketAddrs will be needed to allow various representations of an endpoint. However, unlike ToSocketAddrs, this trait will not perform DNS resolution. It will simply indicate (via the Endpoint enum) that a hostname was used and that DNS resolution should be performed on it.

trait ToEndpoint {
    fn to_endpoint(&self) -> io::Result<(Endpoint, u16)>;
}

enum Endpoint<'a> {
    Hostname(&'a str),
    SocketAddr(SocketAddr),
}

If to_endpoint() returns the Hostname variant, DNS resolution has to be performed. Otherwise, the socket can connect instantly to the SocketAddr. Here's what tcp_connect might look like with the new interface.

fn tcp_connect<T>(endpoint: T) -> IoFuture<TcpSocket> where T: ToEndpoint {
    match endpoint.to_endpoint() {
        Ok((Endpoint::Hostname(host), port)) => {
            // perform dns resolution
            // try to connect to the addresses returned by `resolve`
        }
        Ok((Endpoint::SocketAddr(addr), _port)) => {
            // connect directly
        }
        Err(e) => {
            // fail the returned future with `e`
        }
    }
}
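As a sketch of how the ToEndpoint idea could work in practice, here is a standalone version with two illustrative implementations. The "host:port" string format and the error messages are assumptions made for this example; the draft above does not specify them.

```rust
use std::io;
use std::net::SocketAddr;

/// Mirrors the `Endpoint` enum from the draft above.
#[derive(Debug, PartialEq)]
enum Endpoint<'a> {
    Hostname(&'a str),
    SocketAddr(SocketAddr),
}

/// Mirrors the draft's `ToEndpoint`, taking `&self` so it can be called on a value.
trait ToEndpoint {
    fn to_endpoint(&self) -> io::Result<(Endpoint<'_>, u16)>;
}

impl ToEndpoint for SocketAddr {
    fn to_endpoint(&self) -> io::Result<(Endpoint<'_>, u16)> {
        Ok((Endpoint::SocketAddr(*self), self.port()))
    }
}

/// Assumed input format: "host:port" or a literal socket address.
impl ToEndpoint for str {
    fn to_endpoint(&self) -> io::Result<(Endpoint<'_>, u16)> {
        // IP literals such as "127.0.0.1:80" or "[::1]:80" need no DNS lookup.
        if let Ok(addr) = self.parse::<SocketAddr>() {
            return Ok((Endpoint::SocketAddr(addr), addr.port()));
        }
        let invalid = |msg: &'static str| io::Error::new(io::ErrorKind::InvalidInput, msg);
        let (host, port) = self
            .rsplit_once(':')
            .ok_or_else(|| invalid("expected host:port"))?;
        let port = port.parse::<u16>().map_err(|_| invalid("invalid port"))?;
        // Only records that a lookup is needed; no resolution happens here.
        Ok((Endpoint::Hostname(host), port))
    }
}
```

The point of the split is that parsing never blocks: only the `Hostname` case has to be handed to a resolver afterwards.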

Finally, the futures-mio loop should have settings regarding which resolver to use and how to perform connection (sequential / parallel).

Thoughts?

alexcrichton commented 8 years ago

Sounds pretty reasonable to me! I agree that the event loop would likely have a global resolver for functions like handle.tcp_connect, and then that'd be swappable to allow for various kinds of resolvers.

Now we just need some implementations :)

sbstp commented 8 years ago

I'd love to write a PR for this. Just wanted some feedback before starting.

alexcrichton commented 8 years ago

Perhaps we can start out with a simple implementation of just using futures-cpupool with the to_socket_addrs trait? We may want to avoid adding traits to futures-mio just yet but we could at least get "make a TCP connection to this hostname" methods up and running pretty soon.

That is, we could provide a concrete resolver, and that resolver could have tcp_connect methods as well as extension traits for types like UdpSocket to work with hostnames as well as socket addresses.

sbstp commented 8 years ago

Yeah I'll use futures-cpupool + to_socket_addrs. I would add the structs/traits/enums in a futures-dns crate and have the mio crate depend on it. That way people can use the resolver without mio.

I'm not sure if an extension trait can be used, because we need to keep the CpuPool object around.

alexcrichton commented 8 years ago

For now we may want the dependency the other way (dns depending on mio) so eventually we can implement dns with UDP in mio itself

sbstp commented 8 years ago

If it's done this way, it won't be possible for the mio loop to use an arbitrary resolver via the Resolver trait. If futures-mio depends on futures-dns the UDP resolver can be implemented in the futures-mio crate.

alexcrichton commented 8 years ago

Indeed! I'm thinking we should prototype support outside of the futures-mio crate first, and then when we're happy with it we can move it in.

carllerche commented 8 years ago

I would suggest making the trait take an associated type for the return value. This would allow it to be more globally usable. You could define the loop global version to have a boxed return type.
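One way to read this suggestion, sketched with the standard library's `std::future::Future` (which did not exist when this thread was written; the thread's own code uses the futures crate's `BoxFuture`). `BlockingResolver` and `Boxed` are hypothetical names used only for illustration.

```rust
use std::future::{self, Future, Ready};
use std::io;
use std::net::{IpAddr, ToSocketAddrs};
use std::pin::Pin;

type ResolveResult = io::Result<Vec<IpAddr>>;

/// Resolver trait where each implementation picks its own future type.
trait Resolver {
    type Future: Future<Output = ResolveResult>;
    fn resolve(&self, hostname: &str) -> Self::Future;
}

/// Hypothetical resolver that does the (blocking) lookup eagerly and
/// returns an already-completed future. A real one would not block here.
struct BlockingResolver;

impl Resolver for BlockingResolver {
    type Future = Ready<ResolveResult>;

    fn resolve(&self, hostname: &str) -> Self::Future {
        // Port 0 is a placeholder; only the IP part of each result is kept.
        let result = (hostname, 0)
            .to_socket_addrs()
            .map(|addrs| addrs.map(|addr| addr.ip()).collect());
        future::ready(result)
    }
}

/// Adapter that erases the concrete future type, which is what a
/// loop-global resolver would need in order to be swappable.
struct Boxed<R>(R);

impl<R> Resolver for Boxed<R>
where
    R: Resolver,
    R::Future: Send + 'static,
{
    type Future = Pin<Box<dyn Future<Output = ResolveResult> + Send>>;

    fn resolve(&self, hostname: &str) -> Self::Future {
        Box::pin(self.0.resolve(hostname))
    }
}
```

Code that is generic over `R: Resolver` avoids the allocation, while the event loop could store a `Boxed<_>` (or a trait object built on the boxed future type) as its global default.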

bluejekyll commented 8 years ago

I'm working on adding futures/streams to trust-dns. I have some thoughts that I'd love some feedback on.

1) we want to support more than just A record lookup, definitely need AAAA as well, for the "basic" ip lookup.

2) in addition, we need to support other record types, like SRV for instance.

3) CNAME records are obviously very common. Should we support implicit chained lookups to resolve CNAMEs? Probably yes.

4) I want to support a secure version for my own needs that will support client side DNSSEC validation.

Given these requirements, does anyone have a strong preference for what this generic resolver trait would look like?

sbstp commented 8 years ago

I don't know if the Resolver trait should have an API this low level. 1) and 3) should definitely be supported, but they can be implemented using the current API.

bluejekyll commented 8 years ago

I think 3 is probably straightforward. I can give people an internal interface for 2&4.

But for 1, how should we do AAAA and A? In my client code I am planning on establishing connections based on the IP of the server, i.e. if the IpAddr for the NameServer is v4, establish a udp_client for v4, if v6 use v6 (obviously).

We could do the same with A vs. AAAA, i.e. base it off the NameServer address, but this might not be correct given that some NameServers may only be working over ipv4. I know many OSes first query AAAA and then fall back to A. I guess, how much logic do we want in the resolver, vs. requiring the consumer of the interface to make some of these decisions.

sbstp commented 8 years ago

Right now, the Resolver can return multiple IpAddrs. It's up to the connection code to find one that works. Currently my tokio-dns crate offers both parallel and sequential algorithms. It's up to the user to use the connection mode they prefer.

I think that the resolver should follow CNAMEs, then request AAAA & A records. Returning multiple IpAddrs is an appropriate way of doing things, and is similar to getaddrinfo.
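To make that behaviour concrete, here is a toy sketch over an in-memory record table instead of real DNS queries. The `Record` and `Zone` types, the hop limit of 8, and the choice to list AAAA results before A results are all assumptions of this example, not something tokio-dns or trust-dns is known to do.

```rust
use std::collections::HashMap;
use std::net::{IpAddr, Ipv4Addr, Ipv6Addr};

/// Hypothetical, heavily simplified record type (no TTLs, classes, etc.).
#[derive(Clone, Debug)]
enum Record {
    Cname(String),
    A(Ipv4Addr),
    Aaaa(Ipv6Addr),
}

/// Stand-in for "the DNS": maps a name to its records.
type Zone = HashMap<String, Vec<Record>>;

/// Follows CNAMEs, then returns every AAAA and A address for the final name.
/// Returns an empty list for unknown names or CNAME chains that are too long.
fn resolve(zone: &Zone, hostname: &str) -> Vec<IpAddr> {
    const MAX_CNAME_HOPS: usize = 8; // arbitrary guard against CNAME loops
    let mut name = hostname.to_string();
    for _ in 0..=MAX_CNAME_HOPS {
        let records = match zone.get(&name) {
            Some(records) => records,
            None => return Vec::new(),
        };
        // If the name is an alias, restart the lookup at its target.
        let cname = records.iter().find_map(|r| match r {
            Record::Cname(target) => Some(target.clone()),
            _ => None,
        });
        if let Some(target) = cname {
            name = target;
            continue;
        }
        let v6 = records.iter().filter_map(|r| match r {
            Record::Aaaa(ip) => Some(IpAddr::V6(*ip)),
            _ => None,
        });
        let v4 = records.iter().filter_map(|r| match r {
            Record::A(ip) => Some(IpAddr::V4(*ip)),
            _ => None,
        });
        return v6.chain(v4).collect();
    }
    Vec::new()
}
```

The caller still receives a flat list of addresses, much like getaddrinfo, and the connection code decides which one to use.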

alexcrichton commented 8 years ago

@bluejekyll those requirements definitely sound good to me for a robust dns crate, but the tokio-dns crate that @sbstp is pioneering is a little different I think. Rather I think that tokio-dns is intended to be DNS resolution "for the masses" in the sense that it contains the functionality necessary for 90% of requests, which is typically just "I'd like to connect to this hostname/port" or "I'd like to resolve this hostname to some IP addresses".

For the remaining functionality, which I'm sure will actually be quite common in some circumstances, I'd expect the trust-dns bindings to provide far more granular configuration and APIs. An implementation of a trait in tokio-dns could be layered on top, but it would also provide a suite of other functionality (like you mentioned in 2/4).

Unfortunately the maximum for tokio-dns is probably about what to_socket_addrs can give us right now. That is, we probably need to always be able to implement that trait with the standard library's to_socket_addrs, but that doesn't mean that more flavorful implementations can't exist!

I was talking with @aturon yesterday about how we might integrate DNS resolution with the event loop itself, and we're still a little up in the air.

bluejekyll commented 8 years ago

@sbstp agree.

@alexcrichton, yeah, I've been looking at the nuts and bolts so long, I've almost forgotten how most people use DNS ;) All of this makes sense, I just want to make sure that I have the right context such that I can provide a similar interface for people to use.

I think it would be interesting (at least for me) to give people some options on how DNS is used in to_socket_addrs or tokio-dns. Things like being able to add an in process cache rather than relying on external resolvers for example could be really powerful.

alexcrichton commented 8 years ago

Oh yeah I think we definitely want pluggable backends at a bare minimum. It'd be cool to then have a number of "dns middleware" layers like process caches or the like!
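As a sketch of what one such middleware layer might look like, here is an in-process cache wrapped around any resolver. To stay self-contained it uses a simplified, blocking stand-in for the `Resolver` trait rather than a future-returning one, and `CachingResolver` with its fixed TTL is a hypothetical design (real DNS caches would honour per-record TTLs).

```rust
use std::collections::HashMap;
use std::io;
use std::net::IpAddr;
use std::sync::Mutex;
use std::time::{Duration, Instant};

/// Simplified, blocking stand-in for the `Resolver` trait discussed in the thread.
trait Resolver {
    fn resolve(&self, hostname: &str) -> io::Result<Vec<IpAddr>>;
}

/// Hypothetical middleware: remembers successful lookups for `ttl`.
struct CachingResolver<R> {
    inner: R,
    ttl: Duration,
    cache: Mutex<HashMap<String, (Instant, Vec<IpAddr>)>>,
}

impl<R: Resolver> CachingResolver<R> {
    fn new(inner: R, ttl: Duration) -> Self {
        CachingResolver {
            inner,
            ttl,
            cache: Mutex::new(HashMap::new()),
        }
    }
}

impl<R: Resolver> Resolver for CachingResolver<R> {
    fn resolve(&self, hostname: &str) -> io::Result<Vec<IpAddr>> {
        // Copy the entry out so the lock is not held during the inner lookup.
        let cached = self.cache.lock().unwrap().get(hostname).cloned();
        if let Some((stored_at, addrs)) = cached {
            if stored_at.elapsed() < self.ttl {
                return Ok(addrs);
            }
        }
        // Miss or expired entry: ask the wrapped resolver and store the answer.
        let addrs = self.inner.resolve(hostname)?;
        self.cache
            .lock()
            .unwrap()
            .insert(hostname.to_string(), (Instant::now(), addrs.clone()));
        Ok(addrs)
    }
}
```

Because the wrapper implements the same trait as the resolver it wraps, layers like this can be stacked on top of whichever backend is plugged in.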

Also to be clear, currently tokio-core doesn't do any DNS resolution. It's still deferred to the user to figure out how to do that, but I'd very much like that to change!

alexcrichton commented 8 years ago

Ok I'm going to close this for now in favor of https://github.com/sbstp/tokio-dns now that it exists.