Closed zafnz closed 2 years ago
Discovered the bug also affects valid hostnames that don't resolve, so have updated the bug description and example, demonstrating this isn't an edge case.
The cause is a bit "fun". dns_run takes its list of destinations from main(), which calls it. main() takes the list of arguments provided after --, and attempts to resolve each of them with amp_resolve_add(). If resolution fails, the target is silently dropped. This IMHO is wrong. If it fails, it should warn that a provided target has failed.
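To make the "warn instead of silently failing" idea concrete, here is a rough sketch. It is not amplet2 code: the signature of amp_resolve_add() isn't shown in this thread, so plain getaddrinfo() stands in for it, and resolve_targets() is a hypothetical helper name.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of amplet2): try to resolve every target
 * given on the command line and log a warning for each one that fails,
 * rather than dropping it without a word.
 * Returns the number of targets that failed to resolve.
 */
static int resolve_targets(int count, const char *const names[], FILE *log)
{
    int failed = 0;

    assert(names != NULL && log != NULL);

    for (int i = 0; i < count; i++) {
        struct addrinfo hints;
        struct addrinfo *res = NULL;
        int rc;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;     /* accept IPv4 or IPv6 */
        hints.ai_socktype = SOCK_DGRAM;

        /* getaddrinfo() is a stand-in for amp_resolve_add() here */
        rc = getaddrinfo(names[i], NULL, &hints, &res);
        if (rc != 0) {
            fprintf(log, "warning: failed to resolve target '%s': %s\n",
                    names[i], gai_strerror(rc));
            failed++;
            continue;
        }

        freeaddrinfo(res);
    }

    return failed;
}
```

Calling resolve_targets() with the arguments that follow -- and stderr as the log would print one warning per bad target, so at least the operator can see why a destination went missing.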
Realistically this comes down to what are we testing - are we testing that the amplet client can resolve at all, or can it resolve with the provided DNS servers (specifically falling back if none provided)?
I think I've finally figured out what should be done here, though it took a while. If a destination fails to resolve then it should not be excluded from the list of destinations passed to the test. Instead it should be kept (just without a useful address) and reported on like all other destinations.
This would prevent the case where the DNS test sees no destinations and falls back to the local resolvers, but it is also important for all the other tests too - if for some reason the probe can't resolve a target, it still needs to be reported in order to help differentiate it from cases where the probe wasn't running, or was using a test schedule that didn't actually include that destination. Silently failing and not reporting anything hides these cases.
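Roughly, the approach described above could look like the sketch below. All names here (struct dest, build_dests, should_use_local_resolvers, free_dests) are made up for illustration and getaddrinfo() again stands in for the real resolver, so treat it as an outline of the idea rather than the actual patch.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical destination record: addr stays NULL if resolution failed. */
struct dest {
    const char *name;
    struct addrinfo *addr;
};

/*
 * Fill out[] with one entry per requested name. Names that fail to resolve
 * are kept in the list (with addr == NULL) so they can still be reported.
 * Returns the number of unresolved entries.
 */
static size_t build_dests(size_t count, const char *const names[],
                          struct dest out[])
{
    size_t unresolved = 0;

    assert(names != NULL && out != NULL);

    for (size_t i = 0; i < count; i++) {
        struct addrinfo hints;
        struct addrinfo *res = NULL;

        memset(&hints, 0, sizeof(hints));
        hints.ai_family = AF_UNSPEC;
        hints.ai_socktype = SOCK_DGRAM;

        out[i].name = names[i];
        if (getaddrinfo(names[i], NULL, &hints, &res) != 0) {
            out[i].addr = NULL;   /* keep the entry, just without an address */
            unresolved++;
        } else {
            out[i].addr = res;
        }
    }

    return unresolved;
}

/*
 * Fall back to the local resolvers only when no destinations were requested.
 * Because unresolved names are still counted, a bad target no longer
 * triggers the fallback.
 */
static int should_use_local_resolvers(size_t requested)
{
    return requested == 0;
}

/* Release any addresses that were successfully resolved. */
static void free_dests(size_t count, struct dest dests[])
{
    for (size_t i = 0; i < count; i++) {
        if (dests[i].addr != NULL) {
            freeaddrinfo(dests[i].addr);
            dests[i].addr = NULL;
        }
    }
}
```

The point of the design is that the list length reflects what was asked for, not what happened to resolve, so the test can report an entry for every requested destination and only uses the local resolvers when the list is genuinely empty.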
Couple of patches incoming:
I'll make a release between steps 1 and 2 to confirm that everything continues to work as normal (and to allow time to update nntsc/ampy/ampweb as well), before making the changes that would allow unresolved destinations to actually appear.
The supporting software (nntsc/ampy/ampweb/etc) should now deal with results for destinations that didn't resolve. Everything has been running smoothly with these changes for some time.
When given a DNS server that is either invalid or a hostname that doesn't resolve, amp-dns falls back to the local resolvers rather than reporting an error. Since the target list is what we are testing against, this seems like a bad idea (e.g., are we testing that the client can resolve the -q query string, or are we testing that the client can resolve the query string using the provided DNS servers?).
Expected results: (note, the hostname does not resolve)
Actual results:
(It uses the local DNS servers instead, which in this case are 8.8.8.8 and 8.8.4.4)