StackExchange / dnscontrol

Infrastructure as code for DNS!
https://dnscontrol.org/
MIT License
3.13k stars 397 forks source link

NSONE: 524 Origin Time-out not handled as error #2627

Closed norman-zon closed 10 months ago

norman-zon commented 11 months ago

Describe the bug When getting a 524 Origin Time-out response from NS1, the provider handles it as a WARNING and dnscontrol exits successfully.

To Reproduce I can't figure out how to reproduce a 524.

Here is a log of one occurence:

[INFO: Diff2 algorithm in use. Welcome to the future!]
******************** Domain: myzone.de
WARNING: Error creating domain: PUT https://api.nsone.net/v1/zones/myzone.de: 524 <html>
<head><title>524 Origin Time-out</title><script src="/cdn-cgi/apps/head/BGkBeDlUJpHx3swQRyf58HOAico.js"></script></head>
<body bgcolor="white">
<center><h1>524 Origin Time-out</h1></center>
<hr><center>cloudflare-nginx</center>
</body>
</html>

Expected behavior A 524 Error should be handled as an error and lead to dnscontrol failing with a non-zero exit code

DNS Provider NS1

Additional context dnscontrol v4.4.1

tlimoncelli commented 11 months ago

CC @costasd (maintainer of the NS1 provider)

costasd commented 11 months ago

Thanks, will take a look!

costasd commented 11 months ago

So, looks like this ending up a warning is not ns1-specific (an error is actually emitted for such). The warning looks like it's a warning on purpose when ensuring a zone, so that it doesn't fail the run completely in case of multiple providers.

https://github.com/StackExchange/dnscontrol/blob/bfa9da98c8642073dffc7c7013bcd151b2d061ae/commands/previewPush.go#L209-L215

Looking at the code below the snippet, I see that it actually avoids attempting any further operations to the broken provider (such as records or registrar updates to add nameservers) - so it's safe.

Given that it's safe as it is, I don't see a lot of improvements to be made, besides maybe error on our way out during reporting something like Ensuring zone X in provider Y failed with message?

Would that improve it?

norman-zon commented 11 months ago

Would it be an option to use some other exit code than 0 or 1 for a "partial" success. Or run the others providers, but still exit with 1? In my case I would like to send a notification/alert from a CI run, in the above case.

costasd commented 11 months ago

I see. Makes sense needing to know about this. I've thought of an approach relative to what you suggest, I'll give it a try.