open-contracting / deploy

Deployment configuration and scripts
https://ocdsdeploy.readthedocs.io/en/latest/
Apache License 2.0

Hetzner DNS resolution errors #501

Closed jpmckinney closed 3 days ago

jpmckinney commented 1 month ago

Occurs with:

cc @yolile

jpmckinney commented 1 week ago

Copying Slack comments here:

I have put together a small spreadsheet showing the state of play right now. You can see the different, specific errors we are seeing when looking up the DNCP domain.

Interestingly, DNCP DNS lookups via OpenDNS are working. One theory I am working on is that OpenDNS hosts its recursive DNS servers in different locations from Google and Cloudflare. Any big public DNS service uses anycast networking, which passes our lookup to a physically closer system, so perhaps the issue is on the path between Finland and Paraguay. This is difficult to prove and is pushing the edge of my expertise.

Our self-hosted DNS server workaround works, but I have taken it offline since OpenDNS works as well. I also had one outstanding issue with it: I couldn't get UDP DNS lookups to work (TCP DNS lookups worked fine, however). This is probably a firewall problem.

https://docs.google.com/spreadsheets/d/1TlycadnTEdsrnaHH56gckwWc9jYAB0f1BZCRUpbMmIQ/edit?usp=sharing
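For anyone wanting to reproduce the per-resolver comparison, here is a minimal sketch using only the Python standard library. It sends the same A-record query to each public resolver over UDP and then TCP and reports the DNS response code or the socket error. The resolver IPs are the well-known public addresses for each service and are my assumption, not taken from the spreadsheet; the function names are hypothetical, and this is not the method used to build the spreadsheet.

```python
import socket
import struct

# Assumed resolver addresses (well-known public anycast IPs).
RESOLVERS = {
    "OpenDNS": "208.67.222.222",
    "Google": "8.8.8.8",
    "Cloudflare": "1.1.1.1",
}


def build_query(hostname, query_id=0x1234):
    """Build a DNS query packet asking for the A record of hostname."""
    # Header: id, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # Each label is encoded as a length byte followed by its bytes.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    return header + qname + struct.pack("!HH", 1, 1)  # QTYPE=A, QCLASS=IN


def response_code(response):
    """Return the RCODE (0 = NOERROR, 2 = SERVFAIL, 3 = NXDOMAIN)."""
    flags = struct.unpack("!H", response[2:4])[0]
    return flags & 0x000F


def recv_exact(sock, count):
    """Read exactly count bytes from a TCP socket."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("connection closed before full DNS message arrived")
        data += chunk
    return data


def query(hostname, server, use_tcp=False, timeout=5.0):
    """Send one query to server on port 53 and return the raw response."""
    packet = build_query(hostname)
    if use_tcp:
        # DNS over TCP prefixes each message with a two-byte length.
        with socket.create_connection((server, 53), timeout=timeout) as sock:
            sock.sendall(struct.pack("!H", len(packet)) + packet)
            length = struct.unpack("!H", recv_exact(sock, 2))[0]
            return recv_exact(sock, length)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(packet, (server, 53))
        data, _ = sock.recvfrom(4096)
        return data


def compare_resolvers(hostname):
    """Print the outcome per resolver and transport; needs network access."""
    for name, server in RESOLVERS.items():
        for use_tcp in (False, True):
            transport = "TCP" if use_tcp else "UDP"
            try:
                outcome = f"rcode={response_code(query(hostname, server, use_tcp))}"
            except OSError as error:  # includes timeouts and refused connections
                outcome = f"failed: {error!r}"
            print(f"{name:<10} {transport} {outcome}")
```

Running `compare_resolvers("contrataciones.gov.py")` from the affected server and from another location should show whether failures depend on the resolver, the transport, or the network path.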


Another mitigation we could do is hard-coding the domain's IP address in /etc/hosts. The system will prioritise /etc/hosts over live DNS, which mitigates this issue. The main problem is that if the DNS response changes, we would need something to alert us so that we can update the entry. If you are still seeing errors, and only on this one domain (contrataciones.gov.py), then this would be good to set up.
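The alerting piece could be as small as the sketch below, which compares the address pinned in /etc/hosts against a set of live answers. The function names are hypothetical, and the addresses in the usage note come from the documentation range 192.0.2.0/24, not the real DNCP records. The live answers are assumed to come from querying a resolver directly, because the system resolver would normally return the pinned /etc/hosts entry itself.

```python
def pinned_addresses(hosts_text, hostname):
    """Return the addresses that an /etc/hosts-style text maps to hostname."""
    addresses = set()
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split into address and names.
        fields = line.split("#", 1)[0].split()
        if len(fields) < 2:
            continue
        names = [name.lower() for name in fields[1:]]
        if hostname.lower() in names:
            addresses.add(fields[0])
    return addresses


def has_drifted(pinned, live):
    """Return True when a pinned address is missing from the live answers."""
    return not pinned <= live
```

For example, with a hosts file containing `192.0.2.10 contrataciones.gov.py`, `has_drifted({"192.0.2.10"}, {"192.0.2.99"})` is true, and a cron job could send an alert in that case.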

jpmckinney commented 3 days ago

Using OpenDNS seems to have resolved the issue.