acmesh-official / acme.sh

A pure Unix shell script implementing ACME client protocol
https://acme.sh
GNU General Public License v3.0

AWS Route53 rate exceeded on certificate renewal cron #1506

Open pietervogelaar opened 6 years ago

pietervogelaar commented 6 years ago

Sometimes when executing the certificate renewal cron command, we experience Route53 rate exceeded limits.

/root/.acme.sh/acme.sh --cron --home /root/.acme.sh
Response error:<?xml version="1.0"?>
<ErrorResponse xmlns="https://route53.amazonaws.com/doc/2013-04-01/"><Error><Type>Sender</Type><Code>Throttling</Code><Message>Rate exceeded</Message></Error><RequestId>a3c61f46-3c84-11e8-8ee7-94mnbdhf</RequestId></ErrorResponse>

More information: https://docs.aws.amazon.com/Route53/latest/DeveloperGuide/DNSLimitations.html

This probably means we exceeded the "Five requests per second per AWS account" limit. I'm not sure what the proper fix is yet. Maybe add sleep times somewhere? Does it ring a bell?

FernandoMiguel commented 6 years ago

How many certs are you renewing? I have a "couple" and have never seen that message before.

pietervogelaar commented 6 years ago

The command sees about 20 certificates. But if a certificate is skipped because it's not near expiration, I guess no API calls to Route 53 are executed?

Only two of the twenty certificates were actually due for renewal. The first one succeeded, but the second one failed because of "Rate exceeded".

weehal commented 6 years ago

I have an identical problem. It's only one certificate, but with 3 alt names.

Maybe it's possible to add/remove multiple records in one POST?
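
On the batching question: the Route 53 ChangeResourceRecordSets action accepts a ChangeBatch containing several Change elements, so several records can in principle be sent in a single POST. Below is a minimal sketch of building such a request body. The helper name build_change_batch, the TTL of 300 and the example hostnames are assumptions for illustration, not code taken from dns_aws.sh, and the sketch assumes every record name is distinct.

```shell
#!/bin/sh
# Hypothetical helper (not part of acme.sh): build one request body that
# applies the same action to several TXT records.
# Usage: build_change_batch ACTION name1 value1 [name2 value2 ...]
build_change_batch() {
  _action="$1"
  shift
  printf '%s' '<ChangeResourceRecordSetsRequest xmlns="https://route53.amazonaws.com/doc/2013-04-01/"><ChangeBatch><Changes>'
  # Consume the remaining arguments two at a time: record name, TXT value.
  while [ "$#" -ge 2 ]; do
    printf '<Change><Action>%s</Action><ResourceRecordSet><Name>%s</Name><Type>TXT</Type><TTL>300</TTL><ResourceRecords><ResourceRecord><Value>"%s"</Value></ResourceRecord></ResourceRecords></ResourceRecordSet></Change>' "$_action" "$1" "$2"
    shift 2
  done
  printf '%s\n' '</Changes></ChangeBatch></ChangeResourceRecordSetsRequest>'
}
```

Calling `build_change_batch UPSERT _acme-challenge.a.example.com tokenA _acme-challenge.b.example.com tokenB` prints one body with two Change elements, which would count as a single API request instead of two.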

ddarbyson commented 6 years ago

I have the same issue when Cron is running to manage 3 certificates.

[Sun Aug 26 00:00:06 EDT 2018] Response error:<?xml version="1.0"?>
<ErrorResponse xmlns="https://route53.amazonaws.com/doc/2013-04-01/"><Error><Type>Sender</Type><Code>Throttling</Code><Message>Rate exceeded</Message></Error><RequestId>XXX</RequestId></ErrorResponse>

ddarbyson commented 6 years ago

Well, here's a hacky workaround...

Add a pre-hook to create a delay between requests to the AWS API:

/root/.acme.sh/acme.sh --home /root/.acme.sh --cron --pre-hook /root/.acme.sh/acme-sleep.sh

Here are the contents of acme-sleep.sh:

#!/bin/sh
sleep 7
exit

The idea comes from https://github.com/appscode/go-dns/blob/master/aws/aws.go#L48, which notes that "Route 53 enforces an account-wide(!) 5req/s query limit."

By adding a delay between requests to the AWS API, we are able to generate the certificates without any "Throttling Rate Exceeded" error messages.

I suppose this is better than not working at all!

mabitt commented 6 years ago

@ddarbyson I tried your fix but had no success; my cert has about 20 domains in it.

Instead, I added a sleep 1 at the end of the aws_rest() function (in the dns_aws.sh file).

ddarbyson commented 6 years ago

@mabitt Hmm, well, I'm only registering 3 domains, so my 7-second sleep solution worked for me. I'm not sure whether the number of domains matters. I wouldn't think so, since the AWS API rate limit is based on requests per second, not on the total number of requests.

benurb commented 6 years ago

@ddarbyson Your solution will not work for SAN certificates (i.e. certificates covering multiple domains), since the pre-hook is executed only once per certificate, not once per domain. So @mabitt still runs into AWS's rate limit (and so do I). I can confirm that @mabitt's solution works for now, though.
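
To illustrate the difference: a delay only helps if it runs once per API request rather than once per certificate. A rough sketch of that idea is shown here as a generic wrapper. The function name throttled and the variable AWS_CALL_DELAY are hypothetical and do not exist in acme.sh; in practice the sleep would sit inside whatever function issues the request.

```shell
#!/bin/sh
# Hypothetical wrapper (not part of acme.sh): run one command, then pause,
# so that back-to-back calls stay below a requests-per-second limit.
# AWS_CALL_DELAY is an assumed variable name; it defaults to 1 second.
throttled() {
  "$@"
  _ret=$?
  sleep "${AWS_CALL_DELAY:-1}"
  # Pass the wrapped command's exit status through unchanged.
  return "$_ret"
}
```

With this, a loop over 20 domains that calls `throttled some_api_call "$domain"` pauses after every request, whereas a pre-hook would pause only once before the whole certificate is processed.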

Reiner030 commented 5 years ago

This issue is still a problem when using multiple zones/DNS entries, e.g. for development and testing environments that allow only distinct hostnames, or for users who need to update their Route53 records (outside AWS and its own certificate manager) for their non-AWS instances/servers.

Like @mabitt, I quick-fixed it by adding a small delay of 0.6 seconds (0.4 was too small; for huge lists it may need to be increased further), since the throttling problem is back in AWS eu-central-1 (it was gone last year, after I had this problem 1.5 years ago).

*** dnsapi/dns_aws.sh.orig      2018-12-07 02:15:03.231767186 +0000
--- dnsapi/dns_aws.sh   2019-02-28 17:04:09.995155303 +0000
*************** aws_rest() {
*** 336,340 ****
--- 336,341 ----
      fi
    fi

+   sleep .6
    return "$_ret"
  }

The "hacky workaround" by @ddarbyson also looks very nice. Perhaps it could at least be included in the FAQ, or better, committed as a file and documented as an additional call?

Better still would be a general sleep option passed as a parameter, like the existing:

  --dnssleep  [120]                  The time in seconds to wait for all the txt records to take effect in dns api mode. Default 120 seconds.

with an additional parameter such as:

  --dnsdelay    [0]                  The time in seconds to wait between DNS requests. Default 0 seconds.

Best of all would be to have retry functionality implemented for the DNS calls.
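
A rough sketch of what such a delay plus retry could look like is given here. Everything in it is an assumption for illustration: dns_call_with_retry, DNS_DELAY, DNS_MAX_TRIES and DNS_RETRY_WAIT are hypothetical names, not existing acme.sh options, and the sketch retries on any failure rather than only on a "Throttling" response.

```shell
#!/bin/sh
# Hypothetical settings, not existing acme.sh options.
DNS_DELAY="${DNS_DELAY:-0}"          # seconds to wait after each successful call
DNS_MAX_TRIES="${DNS_MAX_TRIES:-5}"  # total attempts before giving up

# Run "$@" until it succeeds or DNS_MAX_TRIES attempts have been made.
# The wait between attempts starts at DNS_RETRY_WAIT (default 1 second)
# and doubles after every failed attempt.
dns_call_with_retry() {
  _try=1
  _wait="${DNS_RETRY_WAIT:-1}"
  while :; do
    if "$@"; then
      sleep "$DNS_DELAY"
      return 0
    fi
    if [ "$_try" -ge "$DNS_MAX_TRIES" ]; then
      return 1
    fi
    sleep "$_wait"
    _wait=$((_wait * 2))
    _try=$((_try + 1))
  done
}
```

A DNS API function could then be invoked as `dns_call_with_retry some_dns_function args...`, so that a transient "Rate exceeded" failure leads to a slower second attempt instead of aborting the renewal.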