remilapeyre / vault-acme

Mozilla Public License 2.0

Better Error Reporting and Logging #28

Open XDjackieXD opened 2 years ago

XDjackieXD commented 2 years ago

I had a case where one of my DNS servers wasn't accepting the DNS challenge record from the master, so lego failed while checking all servers. Instead of an error message suggesting that something was wrong with one of my DNS servers, I just got "rpc error: code = Canceled desc = context canceled". (I also spent a lot of time debugging other issues during DNS challenge setup a while ago, where better error messages would have helped a lot.) I don't have enough insight into the Vault codebase (and only rudimentary Go knowledge), but if possible it would be amazing if error reporting could be improved in future releases.

remilapeyre commented 2 years ago

Hi @XDjackieXD, when Vault aborts the request because it took too long, there is sadly not much information to return. Can you please try setting the ignore_dns_propagation parameter in your account? It should speed things up, and we should get a better error message to get going.

XDjackieXD commented 2 years ago

Hi! Thanks for the response. I'm not a fan of disabling the DNS propagation checks, as they help to highlight problems early on. It would be nice, though, if the default DNS propagation timeout were smaller than the Vault request timeout, so vault-acme could abort with a useful error message like "DNS propagation check failed: check for server x.y.z failed". It took me way too much time to figure out what was wrong with just the timeout in Vault, and enabling debug logging on a Vault server is pretty annoying, as I have to poke enough shard-holders to unlock the server after the restart.