crazy-max / ddns-route53

Dynamic DNS for Amazon Route 53 on a time-based schedule
https://crazymax.dev/ddns-route53/
MIT License
296 stars 38 forks source link

stuck in "WRN Already running" #911

Closed benedikt-bartscher closed 1 year ago

benedikt-bartscher commented 1 year ago

Behaviour

Steps to reproduce this issue

  1. restart your fw/router
  2. add some luck
  3. ddns-route-53 is stuck
  4. after restarting ddns-route-53 it instantly updates my DNS records

Expected behaviour

ddns-route53 is not stuck

Actual behaviour

ddns-route53 is stuck

Configuration

running in docker

    environment:
      - "TZ=Europe/Paris"
      - "SCHEDULE=*/1 * * * *"
      - "LOG_LEVEL=info"
      - "LOG_JSON=false"
      - "DDNSR53_CREDENTIALS_ACCESSKEYID=******************"
      - "DDNSR53_CREDENTIALS_SECRETACCESSKEY=******************"
      - "DDNSR53_ROUTE53_HOSTEDZONEID=******************"
      - "DDNSR53_ROUTE53_RECORDSSET_0_NAME=******************."
      - "DDNSR53_ROUTE53_RECORDSSET_0_TYPE=A"
      - "DDNSR53_ROUTE53_RECORDSSET_0_TTL=60"

Logs

i am sorry, i did not have loglevel set to debug. i will try to reproduce and update the logs later.

ddns-route53  | Tue, 24 Jan 2023 00:56:01 CET INF Next run in 59 seconds 952 milliseconds (2023-01-24 00:57:00 +0100 CET)
ddns-route53  | Tue, 24 Jan 2023 00:58:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 00:59:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:00:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:01:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:02:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:03:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:04:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:05:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:06:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:07:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:08:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:09:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:10:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:11:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:12:00 CET WRN Already running
ddns-route53  | Tue, 24 Jan 2023 01:12:07 CET WRN Caught signal terminated
crazy-max commented 1 year ago

Hum it seems the scheduler is stuck. Debug logs would help.

Can you also set SCHEDULE=*/10 * * * * and let me know if you still have the issue?

benedikt-bartscher commented 1 year ago

I just retried with env LOG_LEVEL=debug but the output stays the same. How do i enable debug? I can reproduce this problem on every firewall restart. I think setting the cronjob to 10min just reduces the chance of this error but does not solve it. A better way would be to just kill stuck jobs before every run. Or add a reasonable timeout.

benedikt-bartscher commented 1 year ago

maybe we should expose these settings via env vars?: https://aws.github.io/aws-sdk-go-v2/docs/configuring-sdk/retries-timeouts/

benedikt-bartscher commented 1 year ago

i temporarily added os.Exit(1) after log.Warn().Msg("already running") to fix this

crazy-max commented 1 year ago

I just retried with env LOG_LEVEL=debug but the output stays the same.

You should have some DBG logs such as:

ddns-route53    | Mon, 04 Jan 2021 19:15:55 EST INF Starting ddns-route53 version=2.2.0
ddns-route53    | Mon, 04 Jan 2021 19:15:55 EST INF Configuration loaded from file: /ddns-route53.yml
ddns-route53    | Mon, 04 Jan 2021 19:15:55 EST DBG No DDNSR53_* environment variables defined
ddns-route53    | Mon, 04 Jan 2021 19:15:55 EST DBG {
ddns-route53    |   "credentials": {
ddns-route53    |     "accessKeyID": "###########################",
ddns-route53    |     "secretAccessKey": "###########################"
ddns-route53    |   },
ddns-route53    |   "route53": {
ddns-route53    |     "hostedZoneID": "###########################",
ddns-route53    |     "recordsSet": [
ddns-route53    |       {
ddns-route53    |         "name": "#######.#######.",
ddns-route53    |         "type": "A",
ddns-route53    |         "ttl": 180
ddns-route53    |       }
ddns-route53    |     ]
ddns-route53    |   }
ddns-route53    | }

maybe we should expose these settings via env vars?: https://aws.github.io/aws-sdk-go-v2/docs/configuring-sdk/retries-timeouts/

I'm not sure this is an issue when sending a request to AWS. Do you still repro?

pkpfr commented 1 year ago

I also have the same issue. Added a log.Fatal() to fix it.

Aug 24 18:15:14 passkit ddns53[616]: Thu, 24 Aug 2023 18:15:14 BST INF WAN IPv6 has not changed for th1.passkit.com. record set
Aug 24 18:15:14 passkit ddns53[616]: Thu, 24 Aug 2023 18:15:14 BST INF No Route53 record set to update. Skipping...
Aug 24 18:15:14 passkit ddns53[616]: Thu, 24 Aug 2023 18:15:14 BST INF Next run in 29 seconds 998 milliseconds (2023-08-24 18:15:32 +0100 BST)
Aug 24 18:15:32 passkit ddns53[616]: Thu, 24 Aug 2023 18:15:32 BST INF Current WAN IPv4: 171.5.11.77
Aug 24 18:15:32 passkit ddns53[616]: Thu, 24 Aug 2023 18:15:32 BST INF Current WAN IPv6: 2403:6200:8876:cfdc:2ca:f78e:1649:e9dd
Aug 24 18:15:34 passkit ddns53[616]: Thu, 24 Aug 2023 18:15:34 BST INF WAN IPv4 has not changed for th1.passkit.com. record set
Aug 24 18:15:34 passkit ddns53[616]: Thu, 24 Aug 2023 18:15:34 BST INF WAN IPv6 has not changed for th1.passkit.com. record set
Aug 24 18:15:34 passkit ddns53[616]: Thu, 24 Aug 2023 18:15:34 BST INF No Route53 record set to update. Skipping...
Aug 24 18:15:34 passkit ddns53[616]: Thu, 24 Aug 2023 18:15:34 BST INF Next run in 29 seconds 999 milliseconds (2023-08-24 18:16:02 +0100 BST)
Aug 24 18:16:05 passkit ddns53[616]: Thu, 24 Aug 2023 18:16:05 BST INF Current WAN IPv4: 171.5.13.7
Aug 24 18:16:32 passkit ddns53[616]: Thu, 24 Aug 2023 18:16:32 BST WRN Already running
Aug 24 18:17:02 passkit ddns53[616]: Thu, 24 Aug 2023 18:17:02 BST WRN Already running
Aug 24 18:17:32 passkit ddns53[616]: Thu, 24 Aug 2023 18:17:32 BST WRN Already running
Aug 24 18:18:02 passkit ddns53[616]: Thu, 24 Aug 2023 18:18:02 BST WRN Already running
Aug 24 18:18:32 passkit ddns53[616]: Thu, 24 Aug 2023 18:18:32 BST WRN Already running
Aug 24 18:19:02 passkit ddns53[616]: Thu, 24 Aug 2023 18:19:02 BST WRN Already running
crazy-max commented 1 year ago

1064 should mitigate this issue. I will look at making backoff delay customizable if necessary.

Edit: added max backoff delay opt, see https://crazymax.dev/ddns-route53/usage/cli/#options