amp-dns incorrectly falls back to local resolvers with invalid DNS server

zafnz commented 7 years ago

When providing an DNS server that either is invalid, or supplied a hostname that doesn't resolve, amp-dns falls back to local resolves, rather than spits out an error. Since the target list is what we are testing against, this seems like a bad idea. (eg, are we testing that the client can resolve the -q query string, or are we testing the client can resolve the query string from the provided DNS servers?

Expected results: (note, the hostname does not resolve)

root@tut5:~# amp-dns -q google.com -- not-a-valid-hostname.example.com
WARN: DNS server provided is invalid.

Actual results:

root@tut5:~# amp-dns -q google.com -- not-a-valid-hostname.example.com

AMP dns test, 2 destinations, google.com IN A, DSCP default (0x0)
SERVER: localdns (8.8.8.8) 32535us
MSG SIZE sent: 39, rcvd: 39, opcode: QUERY, status: SERVFAIL
flags: qr ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

SERVER: localdns (8.8.4.4) 32388us
MSG SIZE sent: 39, rcvd: 39, opcode: QUERY, status: SERVFAIL
flags: qr ra; QUERY: 1, ANSWER: 0, AUTHORITY: 0, ADDITIONAL: 1

(It uses the local DNS servers instead, which, in this case is 8.8.8.8 and 8.8.4.4)

zafnz commented 7 years ago

Discovered the bug also affects valid hostnames that don't resolve, so have updated the bug description and example, demonstrating this isn't an edge case.

The cause is a bit "fun". dns_run takes the list of destinations from main() which calls it. main() takes the list of arguments provided after --, and attempts to resolve them with amp_resolve_add(). If that fails, it silently fails. This IMHO is wrong. If it fails, it should warn that a provided target has failed.

Realistically this comes down to what are we testing - are we testing that the amplet client can resolve at all, or can it resolve with the provided DNS servers (specifically falling back if none provided)?

brendonj commented 5 years ago

I think I've finally figured out what should be done here, though it took a while. If a destination fails to resolve then it should not be excluded from the list of destinations passed to the test. Instead it should be kept (just without a useful address) and reported on like all other destinations.

This would prevent the case where the DNS test sees no destinations and falls back to the local resolvers, but is also important for all the other tests too - if for some reason the probe can't resolve a target, it still needs to be reported in order to help differentiate it from cases where the probe wasn't running, or using a test schedule that didn't actually include that destination. Silently failing and not reporting anything hides these cases.

Couple of patches incoming:

update tests to accept targets with empty addresses and to report on them
update name resolution to return empty addresses for names that failed to resolve

I'll make a release between steps 1 and 2 to confirm that everything continues to work as normal (and time to update nntsc/ampy/ampweb as well), before making the changes that would allow unresolved destinations to actually appear.

brendonj commented 2 years ago

The supporting software (nntsc/ampy/ampweb/etc) should now deal with results for destinations that didn't resolve. Everything has been running smoothly with these changes for some time.

wandnz / amplet2

amp-dns incorrectly falls back to local resolvers with invalid DNS server #4