Open hannahwhy opened 10 years ago
One potential problem: some people may be running Warriors in environments that forbid outgoing DNS queries to anything outside of a predefined set of DNS servers. (I've never heard of this, but it's definitely possible.)
I think this could be addressed by using the Warrior's resolver as the primary nameserver and the DHCP-provided nameservers as secondary nameservers. It would also be nice to log when this condition is detected, so that we can get some idea of how common this sort of thing is.
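The primary/secondary ordering proposed here can be sketched as a small helper that writes out the `nameserver` lines. This is only an illustration: `build_resolv_conf` and the addresses used are hypothetical and are not part of the Warrior code. It assumes a glibc-style `resolv.conf`, where only the first three `nameserver` entries are read.

```python
# Hypothetical helper, not taken from warrior-code2.
MAX_NAMESERVERS = 3  # glibc's resolver reads at most three nameserver lines


def build_resolv_conf(local_resolver, dhcp_nameservers):
    """Return resolv.conf text with the local resolver listed first.

    DHCP-provided servers follow in their original order, duplicates
    are dropped, and the list is cut to MAX_NAMESERVERS entries.
    """
    ordered = [local_resolver]
    for server in dhcp_nameservers:
        if server not in ordered:
            ordered.append(server)
    kept = ordered[:MAX_NAMESERVERS]
    return "".join(f"nameserver {server}\n" for server in kept)


if __name__ == "__main__":
    # Example addresses only; a real setup would read them from the DHCP lease.
    print(build_resolv_conf("127.0.0.1", ["192.168.1.1", "192.168.1.2"]), end="")
```

Note that a stub resolver typically moves on to a later nameserver only when an earlier one fails to answer, so the DHCP-provided servers would act as a fallback rather than a second opinion.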
So I just saw this:
https://github.com/ArchiveTeam/warrior-code2/blob/master/warrior-install.sh#L41-L49
I was under the impression that dnsmasq provided DNS resolution service on its own. Is this true? (If it is, that makes this behavior a mystery to me.)
I thought dnsmasq was just used for caching DNS requests. The virtual machine should either be passing the host's DNS settings or providing its own DNS server that forwards requests to the host's DNS settings. dnsmasq should be picking up these servers from the virtual machine.
Cross-reference: ArchiveTeam/seesaw-kit#28
In the wretch.cc grab, some Warriors are returning this sort of stuff in their wget.log WARC records:
The website-unavailable.com stuff is an OpenDNS "service" that redirects users to search pages on DNS lookup failure.

This is a source of inconsistency in the Warrior that we can (and should) eliminate. DNS lookup errors ought to be reported (and recorded!) the same way across all grabbers.
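One way to spot this kind of rewriting from inside a grabber is to look up a name that should never exist and see whether an answer comes back anyway. The sketch below is an assumption-laden illustration, not existing Warrior or seesaw-kit code: `resolver_rewrites_nxdomain` is a hypothetical name, and the probe uses the reserved `.invalid` TLD, which a given redirecting resolver may or may not rewrite.

```python
import socket
import uuid


def resolver_rewrites_nxdomain(resolve=socket.getaddrinfo, suffix=".invalid"):
    """Return True if a name that should not exist still resolves.

    `resolve` is injectable so the check can be exercised without a
    network; by default it goes through the system resolver.
    """
    # Random label so that no cache or wildcard record can match by accident.
    probe = f"warrior-probe-{uuid.uuid4().hex}{suffix}"
    try:
        resolve(probe, 80)
    except socket.gaierror:
        # Lookup failed, which is the honest answer for a nonexistent name.
        # (A plain network outage also lands here and is not distinguished.)
        return False
    return True


if __name__ == "__main__":
    if resolver_rewrites_nxdomain():
        print("resolver returned an address for a nonexistent name")
    else:
        print("no rewriting detected")
```

A result of True would be worth logging alongside the grab, since it marks records whose DNS failures may have been replaced by a redirect page.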
To eliminate this problem, I propose that the Warrior run its own DNS resolver and cache and that the Warrior VM be set to use it. I prefer djbdns or the Debian dbndns fork, but there are other good choices.