socketry / async-dns

An asynchronous DNS resolver and server.

Round-robin endpoints #21

Open brandonhilkert opened 1 year ago

brandonhilkert commented 1 year ago

In this code, I noticed the endpoints are iterated through in the same order each time. We're having some trouble with upstream forwarders and seem to be hitting some rate limits. We have some interest in pursuing caching, but first wanted to confirm the behavior: does every request go through the same loop and try the first endpoint in the list first?

We're using the following Resolver:

  UPSTREAM = Async::DNS::Resolver.new(
    [
      [:udp, ENV["FORWARDER_2_IP"], 53],
      [:tcp, ENV["FORWARDER_2_IP"], 53],
      [:udp, ENV["FORWARDER_1_IP"], 53],
      [:tcp, ENV["FORWARDER_1_IP"], 53],
    ],
    timeout: 5.0,
  )

I assume this means FORWARDER_2_IP is taking the bulk of the traffic?

There's a branch that drops the UDP servers for large packets:

# We select the protocol based on the size of the data:
if @packet.bytesize > UDP_TRUNCATION_SIZE
    @endpoints.delete_if{|server| server[0] == :udp}
end

But assuming they aren't removed from the list, would FORWARDER_2_IP be hit twice (once over UDP, then again over TCP) before FORWARDER_1_IP is tried at all?
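
To make my reading concrete, here's roughly what I think happens (IPs made up for illustration):

# Hypothetical values standing in for the ENV lookups above:
endpoints = [
  [:udp, "10.0.0.2", 53],  # FORWARDER_2_IP
  [:tcp, "10.0.0.2", 53],
  [:udp, "10.0.0.1", 53],  # FORWARDER_1_IP
  [:tcp, "10.0.0.1", 53],
]

# Small packet: nothing is removed, so FORWARDER_2_IP is tried over UDP,
# then TCP, before FORWARDER_1_IP ever sees a query.

# Large packet: the UDP entries are removed, but FORWARDER_2_IP still comes first:
endpoints.delete_if{|server| server[0] == :udp}
# => [[:tcp, "10.0.0.2", 53], [:tcp, "10.0.0.1", 53]]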

ioquatix commented 1 year ago

Yes, we could implement some kind of round-robin approach.
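
Roughly what I have in mind (just a sketch, none of these names exist in the library yet):

class RoundRobinEndpoints
  def initialize(endpoints)
    @endpoints = endpoints
    @offset = 0
  end

  # Each call yields the endpoints rotated by one more position,
  # so every upstream takes a turn at being tried first:
  def each(&block)
    @endpoints.rotate(@offset).each(&block)
    @offset = (@offset + 1) % @endpoints.size
  end
end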

brandonhilkert commented 1 year ago

I’m happy to put together a PR. Would you imagine it to be configurable with the resolver options?

ioquatix commented 1 year ago

The resolver interface is quite old, and while the endpoint list is currently specified as an array, it would make more sense to have an opaque policy/interface for this. Maybe we can just sub in an object that responds to each, and change @endpoints.delete_if{|server| server[0] == :udp} to use a proper method, i.e.

class Async::DNS::Servers
  # Yield each endpoint suitable for a packet of the given size,
  # e.g. skipping UDP endpoints when the packet would be truncated:
  def each(packet_size)
    yield endpoint
  end
end

Without reviewing the exact code, I don't know the best way, but I think something like this would make sense.
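
Then Request#each could just delegate instead of filtering the array itself (again, hypothetical; @servers would be the policy object above):

def each(&block)
  # Let the policy object decide which endpoints suit this packet size,
  # rather than mutating a copied array in the constructor:
  @servers.each(@packet.bytesize, &block)
end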

brandonhilkert commented 1 year ago

I think it may be more complex than that, because of the Request class, and because each endpoint ends up wrapped in Async::IO::Endpoint:

class Request
    def initialize(message, endpoints)
        @message = message
        @packet = message.encode

        @endpoints = endpoints.dup

        # We select the protocol based on the size of the data:
        if @packet.bytesize > UDP_TRUNCATION_SIZE
            @endpoints.delete_if{|server| server[0] == :udp}
        end
    end

    attr :message
    attr :packet
    attr :logger

    def each(&block)
        Async::IO::Endpoint.each(@endpoints, &block)
    end

    def update_id!(id)
        @message.id = id
        @packet = @message.encode
    end
end

I don't want to rock the boat here, but the whole thing feels like it could use some work. I didn't find update_id! used anywhere, and it's not clear to me that the methods on Async::IO::Endpoint are actually used in this package, which makes me wonder whether it's necessary at all.

brandonhilkert commented 1 year ago

I see now that the class methods from https://github.com/socketry/async-io/blob/main/lib/async/io/endpoint/each.rb are used here on Async::IO::Endpoint, so you can ignore that comment.

ioquatix commented 1 year ago

This code is over 10 years old, and it was designed to be compatible with the original RubyDNS interface, which isn't ideal. I'm happy to take a look. Specifying endpoints in the form [:tcp|:udp, address, port] is not the best format, and the each code is one of the original interfaces I created as part of RubyDNS, so it's showing its age.

Given that this is endpoint-specific, i.e. the maximum UDP size should depend on IPv4 vs IPv6, we might want a better way to encode this, or simply ignore the limit, try UDP anyway, and fall back to TCP if no response is received. In theory UDP can support much larger packets; they are just less likely to be delivered. But if the query goes via, say, localhost to systemd-resolved, that won't really be an issue. It's a tricky implementation detail.
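
Sketching the fallback idea (send_udp, send_tcp and truncated? are placeholders, not real methods in this gem):

def resolve_with_fallback(endpoint, packet)
  # Optimistically send over UDP first, regardless of packet size:
  response = send_udp(endpoint, packet)

  # If nothing came back, or the answer was truncated, retry over TCP:
  if response.nil? || truncated?(response)
    response = send_tcp(endpoint, packet)
  end

  response
end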

brandonhilkert commented 1 year ago

I'm less familiar with those kinds of details, so it's easy for me to sit back and think about what would work for us, but I totally get that it's more complicated than that. I'm happy to help, though I'm fearful it won't be useful because of that. I suppose I could just monkey-patch things for now to randomize the array on each request, which would solve our need, but I'm also happy to help update the library if you think that's a good fit.
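
For what it's worth, the stopgap I have in mind is roughly this (not a real patch, just the shape of it):

FORWARDERS = [
  [:udp, ENV["FORWARDER_1_IP"], 53],
  [:tcp, ENV["FORWARDER_1_IP"], 53],
  [:udp, ENV["FORWARDER_2_IP"], 53],
  [:tcp, ENV["FORWARDER_2_IP"], 53],
]

# Build the resolver with a freshly shuffled endpoint list so no single
# forwarder always takes the first attempt:
def upstream_resolver
  Async::DNS::Resolver.new(FORWARDERS.shuffle, timeout: 5.0)
end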