zodern / meteor-up

Production Quality Meteor Deployment to Anywhere
http://meteor-up.com/
MIT License

Letsencrypt rateLimited 429 error #1211

Closed timsun28 closed 3 years ago

timsun28 commented 3 years ago

Mup version: 1.5.2 (but I first got this error around 4 days ago, on 1.5.0)

Mup config

{
  "servers": {
    "one": {
      "host": "1.2.3.4",
      "username": "root",
      "pem": "~/.ssh/pem"
    }
  },
  "app": {
    "name": "my-app",
    "path": "../",
    "servers": {
      "one": {}
    },
    "buildOptions": {
      "serverOnly": true
    },
    "env": {
      "ROOT_URL": "https://subdomain.subdomain.host.com",
      "MONGO_URL": "mongodb://mongodb:27017/my-app",
      "MONGO_OPLOG_URL": "mongodb://mongodb/local",
      "VIRTUAL_HOST": "subdomain.subdomain.host.com",
      "HTTPS_METHOD": "redirect",
      "LETSENCRYPT_HOST": "subdomain.subdomain.host.com",
      "LETSENCRYPT_EMAIL": "email@domain.com",
      "VIRTUAL_PORT": 3000,
      "HTTP_FORWARDED_COUNT": 1
    },
    "docker": {
      "image": "abernix/meteord:node-12.16.1-base",
      "stopAppDuringPrepareBundle": true,
      "imagePort": 3000,
      "args": [
        "--link=mongodb:mongodb"
      ]
    },
    "enableUploadProgressBar": true,
    "type": "meteor"
  },
  "mongo": {
    "version": "4.4",
    "servers": {
      "one": {}
    },
    "dbName": "eshs-inspection"
  },
  "proxy": {
    "domains": "subdomain.subdomain.host.com",
    "ssl": {
      "letsEncryptEmail": "email@domain.com",
      "forceSSL": true
    }
  }
}

Output of command (mup proxy logs-le)

[104.248.95.147]2020/12/04 10:06:16 Received event die for container d97c429c4478
[104.248.95.147]2020/12/04 10:06:16 Debounce minTimer fired
[104.248.95.147]2020/12/04 10:06:16 Generated '/app/letsencrypt_service_data' from 3 containers
[104.248.95.147]2020/12/04 10:06:16 Running '/app/signal_le_service'
[104.248.95.147]Sleep for 3600s
[104.248.95.147]2020/12/04 10:06:17 Received event start for container 4b5296f95038
[104.248.95.147]2020/12/04 10:06:18 Debounce minTimer fired
[104.248.95.147]2020/12/04 10:06:18 Generated '/app/letsencrypt_service_data' from 4 containers
[104.248.95.147]2020/12/04 10:06:18 Running '/app/signal_le_service'
[104.248.95.147]Creating/renewal subdomain.subdomain.host.com certificates... (subdomain.subdomain.host.com)
[104.248.95.147][Fri Dec  4 10:06:20 UTC 2020] Using CA: https://acme-v02.api.letsencrypt.org/directory
[104.248.95.147][Fri Dec  4 10:06:20 UTC 2020] Single domain='subdomain.subdomain.host.com'
[104.248.95.147][Fri Dec  4 10:06:20 UTC 2020] Getting domain auth token for each domain
[104.248.95.147][Fri Dec  4 10:06:22 UTC 2020] Create new order error. Le_OrderFinalize not found. {
[104.248.95.147]  "type": "urn:ietf:params:acme:error:rateLimited",
[104.248.95.147]  "detail": "Error creating new order :: too many certificates already issued for exact set of domains: subdomain.subdomain.host.com: see https://letsencrypt.org/docs/rate-limits/",
[104.248.95.147]  "status": 429
[104.248.95.147]}
[104.248.95.147][Fri Dec  4 10:06:22 UTC 2020] Please check log file for more details: /dev/null
[104.248.95.147]Sleep for 3600s

According to the Let's Encrypt site, the limit is 5 certificates per week for the same set of domains. That limit applies to renewing the SSL certificate, and from what I read Meteor Up only does this once a month.
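A rough way to confirm that a server is hitting this limit is to scan the `mup proxy logs-le` output for the ACME `rateLimited` error type. This is a minimal sketch that assumes the log format shown above; `count_rate_limited` is a hypothetical helper name, not part of mup.

```shell
#!/bin/bash
# Hypothetical helper: counts log lines (read from stdin) that carry the
# ACME rate-limit error type seen in the log output above.
count_rate_limited() {
  # grep -c prints 0 but exits non-zero when nothing matches,
  # so "|| true" keeps the helper usable under "set -e".
  grep -c 'urn:ietf:params:acme:error:rateLimited' || true
}
```

It could be used as `mup proxy logs-le | count_rate_limited`; any value above zero means the companion container was refused a certificate at least once.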

I haven't had this issue with previous projects, but I have a suspicion about why this could be happening. In the past 2 weeks I changed two things about how I deploy with Meteor Up.

Because of the bug I now have to switch back to an older Node version (12.18.3) using nvm to get it to deploy. This might be a reason why it's trying to refresh the domain every time. I also wrote a simple bash script that does these steps for me, because I kept forgetting to switch the nvm version back.

This is the bash script I used to deploy the project:

#!/bin/bash

export NVM_DIR=$HOME/.nvm;
source $NVM_DIR/nvm.sh;
cd .mup-beta
nvm use 12.18.3
echo 'Deploying from:'
echo $PWD
mup setup
mup deploy
echo 'Finished Deployment!'

I am hoping that this can get fixed soon, because it's currently breaking my application, which shows the following error:

subdomain.subdomain.host.com uses an invalid security certificate. The certificate is not trusted because it is self-signed. Error code: MOZILLA_PKIX_ERROR_SELF_SIGNED_CERT

The app runs on Google's new .app domain, so it is not accessible without HTTPS either. I have switched to a new subdomain for now and was able to deploy there without any issues, but I'm worried the same issue will reappear after deploying too often.

oinofactordevs commented 3 years ago

+1

oinofactordevs commented 3 years ago

@timsun28 I'm having the same issue with multiple projects. I thought the issue was related to my CI, but now I'm guessing that it is related to this update of a MUP dependency: https://github.com/nginx-proxy/docker-letsencrypt-nginx-proxy-companion/issues/718

https://github.com/nginx-proxy/docker-letsencrypt-nginx-proxy-companion

I'm trying to figure out how to use a previous version of this dependency. @zodern can you give me a tip on how to do that?

timsun28 commented 3 years ago

@oinofactordevs Thank you for mentioning this, I will also look into it. I am glad it only happened with my beta domain and I could move it to a different domain, but I'm holding off on pushing a new version to production, because the beta one is still down after a week and I can't seem to get it up again ...

Hope this can get fixed soon or reverted to an old version for the moment.

Update: it has now happened with my second beta domain as well. And users of a domain that only accepts HTTPS connections (.app, for example) can't access the site even by adding an exception for it.

oinofactordevs commented 3 years ago

@timsun28 I'm having the issue on multiple platforms, and it seems to be related to the latest version of the Docker image I shared. The image's update date matches the beginning of this issue (a week ago). The platforms that already hit the issue will not work for a week, until Let's Encrypt allows you to get a new certificate. My plan is to change the MUP code in this file (from jrcs/letsencrypt-nginx-proxy-companion:latest to jrcs/letsencrypt-nginx-proxy-companion:v1.13.1) and try the older version of the Docker image.

oinofactordevs commented 3 years ago

@timsun28 I think I found the way to get the broken sites back:

1 - On your remote server, get the Docker container ID assigned to mup-nginx-proxy-letsencrypt.
2 - Stop the container: docker stop mup-nginx-proxy-letsencrypt
3 - Remove the container: docker rm containerID
4 - On the machine you deploy from, locate the mup package installation and change the following in its source code:

a) src/plugins/proxy/assets/templates/start.sh

line 19: jrcs/letsencrypt-nginx-proxy-companion:latest -> jrcs/letsencrypt-nginx-proxy-companion:v1.13.1
line 79: jrcs/letsencrypt-nginx-proxy-companion -> jrcs/letsencrypt-nginx-proxy-companion:v1.13.1

b) src/plugins/meteor/assets/templates/start.sh

line 10: LETS_ENCRYPT_VERSION=latest -> LETS_ENCRYPT_VERSION=v1.13.1

5 - Run mup init, mup setup and mup deploy for your project.

I'm testing this on all my projects while writing it, so sorry for any typos. I hope this helps; this has been driving me crazy too, and I have a lot of production platforms that I can't update until this is fixed.
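The file edits in step 4 could also be scripted instead of done by hand. The following is only a sketch: `pin_companion_tag` is a hypothetical helper, and the sed patterns assume the template files contain exactly the strings quoted in step 4 (the real file contents are not shown in this thread).

```shell
#!/bin/bash
# Sketch only: pins the Let's Encrypt companion image to v1.13.1 in one
# mup template file. Names and patterns here are illustrative assumptions.
PINNED_TAG="v1.13.1"
IMAGE="jrcs/letsencrypt-nginx-proxy-companion"

pin_companion_tag() {
  local file="$1" tmp status
  tmp="$(mktemp)" || return 1
  # 1) image:latest      -> image:v1.13.1
  # 2) image with no tag -> image:v1.13.1 (already tagged references are skipped)
  # 3) LETS_ENCRYPT_VERSION=latest -> LETS_ENCRYPT_VERSION=v1.13.1
  sed -E \
    -e "s#${IMAGE}:latest#${IMAGE}:${PINNED_TAG}#g" \
    -e "s#${IMAGE}([^:]|\$)#${IMAGE}:${PINNED_TAG}\\1#g" \
    -e "s#LETS_ENCRYPT_VERSION=latest#LETS_ENCRYPT_VERSION=${PINNED_TAG}#g" \
    "$file" > "$tmp" && cat "$tmp" > "$file"   # cat keeps the file's permissions
  status=$?
  rm -f "$tmp"
  return "$status"
}
```

It would be called once per template file, for example `pin_companion_tag src/plugins/proxy/assets/templates/start.sh`, with the path adjusted to wherever mup is installed. Running it a second time leaves the file unchanged, because an already tagged image no longer matches any of the patterns.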

Let me know if it works for you.

timsun28 commented 3 years ago

@oinofactordevs Thank you for sharing! I will try it tomorrow morning. I really appreciate that you shared this, and hopefully you get your apps back online as well.

I will keep you updated!

fabian-aramendi commented 3 years ago

@zodern I don't have a GitHub client right now, so I imported the repo on Bitbucket: https://bitbucket.org/bitskingdom/meteor-up/commits/ff1d6862eaef485a792f9bac127e367f1d2033e8 Please check that commit and let me know if it makes sense for @timsun28's issue.

@timsun28 this fixed all my projects. I'm not sure exactly what the issue was, but it is related to the latest version of this Docker image dependency: https://github.com/nginx-proxy/docker-letsencrypt-nginx-proxy-companion

I hope this helps!

fabian-aramendi commented 3 years ago

Just as a reminder: I suspect the issue is related to the location of the certificates on the server. It seems that every time I deployed, a certificate was requested even if one had already been created. That explains:

1 - why I was able to reproduce the issue on my staging environments (where I have hourly deploys)
2 - why, after I changed the Docker image, the old certificates were recognized and no new certificate needed to be requested (keep in mind that the same server, using the same IP, was hitting the Let's Encrypt rate limit before the fix)

cc @zodern and @timsun28 BTW: I'm also @oinofactordevs

timsun28 commented 3 years ago

@fabian-aramendi I was able to get everything back online thanks to your guide! For others who have mup installed as a global package: you need to make the changes in the following location: ~/.nvm/versions/node/<your node version>/lib/node_modules/mup

You can find this location by typing npm root -g in your terminal.

For me the files were also in a lib folder instead of the src folder you mentioned, but it all worked out well.
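Since the files may sit under lib rather than src, searching the installed package for the image name avoids guessing the path. This is a minimal sketch; `find_companion_refs` is a hypothetical helper, and the search strings are the ones quoted in the guide above.

```shell
#!/bin/bash
# Hypothetical helper: lists files under the given directory that mention
# the companion image or the LETS_ENCRYPT_VERSION variable.
find_companion_refs() {
  grep -rlE 'letsencrypt-nginx-proxy-companion|LETS_ENCRYPT_VERSION' "$1"
}
```

For a global install this could be run as `find_companion_refs "$(npm root -g)/mup"`, which prints the files that would need the version change.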

Judging from the nginx-proxy-companion changelog, it seems they released a 2.0 version that is not backwards compatible in some parts, which caused the issues.

zodern commented 3 years ago

Thanks for finding the cause. This is fixed in Mup 1.5.3.

buchdag commented 3 years ago

Additional information if your project uses jrcs/letsencrypt-nginx-proxy-companion:latest:

Required reading if you use the latest version: the recent v2.0.0 release of this project marks the switch of the ACME client used by the Docker image from simp_le to acme.sh. This switch results in some backward-incompatible changes, so please read this issue and the updated docs for more details before updating your image. The single most important change is that the container now requires a volume mounted to /etc/acme.sh in order to persist ACME account keys and SSL certificates. The last tagged version that uses simp_le is v1.13.1.
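One way to check whether a running companion container has such a volume is to list its mount destinations with `docker inspect`. This is a sketch under assumptions: `has_mount` is a hypothetical helper, and the container name mup-nginx-proxy-letsencrypt is the one mentioned earlier in this thread.

```shell
#!/bin/bash
# Hypothetical helper: reads mount destinations (one per line) from stdin
# and succeeds only if the given path is among them (exact line match).
has_mount() {
  grep -Fxq -- "$1"
}

# Example usage on the server (not run here):
#   docker inspect --format '{{range .Mounts}}{{println .Destination}}{{end}}' \
#     mup-nginx-proxy-letsencrypt | has_mount /etc/acme.sh \
#     && echo "acme.sh volume present" || echo "acme.sh volume missing"
```

If the volume is missing, certificates issued by acme.sh are lost whenever the container is recreated, which would fit the repeated certificate requests described above.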

ajitStephen commented 3 years ago

I have updated to Mup 1.5.3, removed all Docker containers from the server, and ran the deployment again.

But I still get this error:

ACME server returned an error: urn:ietf:params:acme:error:rateLimited :: There were too many requests of a given type :: Error creating new order :: too many certificates already issued for exact set of domains: #####: see https://letsencrypt.org/docs/rate-limits/

But the app was live the next morning. Is there an explanation for this?