ArchiveTeam / wpull

Wget-compatible web downloader and crawler.

resolve_dns hook lacks information on IPv4/IPv6 preference and can't return more than one result #461

Open JustAnotherArchivist opened 3 years ago

JustAnotherArchivist commented 3 years ago

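The resolve_dns hook is useful to override DNS to retrieve content from a server that used to be pointed at by a domain name. However, this is limited primarily by two factors:

  1. The hook has no information on the IPv4/IPv6 preference.
  2. The hook can't return more than one result.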

The obvious resolution for the latter is to add support for a sequence (list/tuple/set/etc.) return value. While alternate hostnames could perhaps be useful sometimes in this context, this is a very rare use case and adds significant complexity, so I'd propose that all entries must be IP addresses in such a sequence. --rotate-dns could be implemented by random selection.
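A rough sketch of how a sequence return value could be handled, not taken from the wpull code. The names `parse_address_sequence` and `pick_address` are made up for illustration, and the standard library's `ipaddress` module stands in for whatever validation the Resolver would actually use. A single string return value would keep its current meaning and is deliberately not handled here.

```python
import ipaddress
import random


def parse_address_sequence(entries):
    """Validate a sequence (list/tuple/set/...) returned by the hook.

    Hypothetical helper. Every entry must be an IP address; a hostname
    makes ipaddress.ip_address() raise ValueError.
    """
    if isinstance(entries, str):
        # A bare string is also iterable, so reject it explicitly.
        raise TypeError("expected a sequence of IP addresses, not a string")
    return [ipaddress.ip_address(entry) for entry in entries]


def pick_address(addresses, rotate=False, rng=random):
    """Pick one address; with rotate=True, choose at random."""
    if not addresses:
        raise ValueError("the hook returned no addresses")
    return rng.choice(addresses) if rotate else addresses[0]
```

With `rotate=False` the first entry wins, so a hook returning a list or tuple can express an order of preference. A set has no defined order, so it only makes sense together with random selection.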

I see two possible ways of implementing the IPv4/IPv6 preference:

  1. Pass the preference as an argument to the hook. The hook would have to check the value and only return IP addresses of the corresponding type.
  2. Keep the preference out of the hook. The hook simply returns one sequence containing both IPv4 and IPv6 addresses, and the Resolver filters the results based on the preference.

I'm slightly leaning towards the latter as it keeps the hook interface simpler, and I can't think of a use case where actually knowing the preference in the hook would be important.
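To illustrate the second option, here is a minimal sketch of the filtering step on the Resolver side. The `FamilyPreference` enum and `filter_by_preference` function are hypothetical and their names are not taken from wpull; the sketch only assumes that the preference can be one of "any", "IPv4 only", "IPv6 only", "prefer IPv4" or "prefer IPv6".

```python
import enum
import ipaddress


class FamilyPreference(enum.Enum):
    # Hypothetical names for illustration, not wpull's own enum.
    ANY = "any"
    IPV4_ONLY = "ipv4-only"
    IPV6_ONLY = "ipv6-only"
    PREFER_IPV4 = "prefer-ipv4"
    PREFER_IPV6 = "prefer-ipv6"


def filter_by_preference(addresses, preference):
    """Filter and order the hook's addresses by address family."""
    parsed = [ipaddress.ip_address(address) for address in addresses]
    ipv4 = [str(a) for a in parsed if a.version == 4]
    ipv6 = [str(a) for a in parsed if a.version == 6]

    if preference is FamilyPreference.IPV4_ONLY:
        return ipv4
    if preference is FamilyPreference.IPV6_ONLY:
        return ipv6
    if preference is FamilyPreference.PREFER_IPV4:
        return ipv4 + ipv6
    if preference is FamilyPreference.PREFER_IPV6:
        return ipv6 + ipv4
    # ANY: keep the order the hook returned.
    return [str(a) for a in parsed]
```

The hook stays unaware of the preference: it returns everything it knows, and an "only" preference may legitimately leave an empty list, which the Resolver would then have to treat like a failed lookup.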

On a related note, it appears that the code always makes the round trip to dnspython or the system resolver, even if the hook already returned an IP address. This seems unnecessary. Instead, dns.inet.is_address could be used to detect that case and return the address directly.
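A sketch of that short-circuit, under a few assumptions: `resolve_with_shortcut` and its `lookup` parameter are hypothetical stand-ins for the Resolver's real lookup path, and the standard library's `ipaddress` module is used in place of dns.inet.is_address so that the example runs without dnspython.

```python
import ipaddress


def is_ip_address(text):
    """Return True if text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(text)
    except ValueError:
        return False
    return True


def resolve_with_shortcut(host, lookup):
    """Resolve host, skipping the lookup if it is already an IP address.

    lookup is a hypothetical callable standing in for the dnspython or
    system resolver round trip; it takes a hostname and returns a list
    of address strings.
    """
    if is_ip_address(host):
        return [host]
    return lookup(host)
```

For example, `resolve_with_shortcut("192.0.2.1", lookup)` returns `["192.0.2.1"]` without ever calling `lookup`, while a hostname is still passed through to it.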