okfn-brasil / jarbas

🎩 API for information and suspicions about reimbursements by Brazilian congresspeople
https://jarbas.serenata.ai/

Use SSL offload and auto certificate renewal #293

Closed ltouro closed 6 years ago

ltouro commented 6 years ago

Change the prod docker-compose file to use a proxy that handles SSL offload and automatic SSL certificate renewal.

What is the purpose of this Pull Request? Ease certificate renewal and related operations.

What was done to achieve this purpose? Used jwilder/nginx-proxy and jrcs/letsencrypt-nginx-proxy-companion container images.
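
Roughly, the pattern those two images implement together looks like this; a simplified sketch only, where the service names, volume names, and the NGINX_PROXY_CONTAINER wiring are illustrative rather than copied from the actual compose file:

services:
  proxy:
    image: jwilder/nginx-proxy
    container_name: proxy
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - certs:/etc/nginx/certs:ro
      - vhost:/etc/nginx/vhost.d
      - html:/usr/share/nginx/html
      - /var/run/docker.sock:/tmp/docker.sock:ro    # nginx-proxy regenerates its config from container events

  proxy-certs:
    image: jrcs/letsencrypt-nginx-proxy-companion
    environment:
      NGINX_PROXY_CONTAINER: proxy                  # tells the companion which container runs nginx-proxy
    volumes:
      - certs:/etc/nginx/certs:rw                   # issued certificates are written here
      - vhost:/etc/nginx/vhost.d
      - html:/usr/share/nginx/html                  # ACME HTTP-01 challenge files are served from here
      - /var/run/docker.sock:/var/run/docker.sock:ro

volumes:
  certs:
  vhost:
  html:

Application containers then only need VIRTUAL_HOST, LETSENCRYPT_HOST, and LETSENCRYPT_EMAIL environment variables to get a virtual host and a certificate from the edge proxy.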

How to test if it really works?

docker-compose up -d && docker-compose logs -f --tail=10

The certificate should be generated on start.

Who can help reviewing it? cuducos

TODO

ltouro commented 6 years ago

Close #291

ltouro commented 6 years ago

We can avoid bind-mounting the Docker socket on the edge proxy -- considering it is exposed to the internet -- by using yet another container to generate the nginx templates. Please advise if that is required.
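
For reference, that split would look roughly like this (a simplified sketch following the upstream docker-gen docs; image, template path, and names are illustrative, not this repo's files):

services:
  nginx:
    image: nginx
    container_name: nginx                            # must match the -notify-sighup target below
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - conf:/etc/nginx/conf.d                       # reads the config that docker-gen writes

  dockergen:
    image: jwilder/docker-gen
    command: -notify-sighup nginx -watch /etc/docker-gen/templates/nginx.tmpl /etc/nginx/conf.d/default.conf
    volumes:
      - conf:/etc/nginx/conf.d
      - ./nginx.tmpl:/etc/docker-gen/templates/nginx.tmpl:ro
      - /var/run/docker.sock:/tmp/docker.sock:ro     # only this internal container sees the Docker socket

volumes:
  conf: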

cuducos commented 6 years ago

Hi @ltouro, many thanks for that!

I tried it out but got an error:

$ docker-compose -f docker-compose.yml -f docker-compose.prod.yml up
ERROR: yaml.parser.ParserError: while parsing a block mapping
  in "./docker-compose.prod.yml", line 1, column 1
expected <block end>, but found '<block mapping start>'
  in "./docker-compose.prod.yml", line 22, column 3

I think the proxy block is indented with 4 spaces while the rest of the services are indented with 2 spaces.
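
In other words, every service key should sit at the same two-space level, e.g. (illustrative snippet, not the actual file):

services:
  nginx:
    image: nginx
  proxy:
    image: jwilder/nginx-proxy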

Other comments:

  1. Should we support Swarm Mode? Not necessary right now. In the future that could be useful!
  2. Is jwilder/nginx-proxy working with docker-compose.yml version 3? It used to require version 2.
  3. As you edited docker-compose.prod.yml, I guess the "How to test it" part (in the opening message of the PR) should be docker-compose -f docker-compose.yml -f docker-compose.prod.yml up -d, not just docker-compose up -d… is that right or am I missing something here?
cuducos commented 6 years ago

Fixing the indentation issue I got:

ERROR: Named volume "proxy/certs:/etc/nginx/certs:rw" is used in service "proxy" but no declaration was found in the volumes section.
ltouro commented 6 years ago

@cuducos Sorry about that. I misunderstood named volumes, as I'm still using bind mounts in my case. It should work now.
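
The missing piece was the top-level declaration that a named volume needs, roughly like this (the volume name here is illustrative):

services:
  proxy:
    volumes:
      - certs:/etc/nginx/certs:rw   # a named volume now, not a bind mount

volumes:
  certs:                            # this top-level declaration is what compose was complaining about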

About jwilder/nginx-proxy, I'm currently using it with compose v3 with no problem.

About testing, you are right. You should use both compose files as they are incremental (I did not notice that previously).

I removed the "bridge" network setting in favor of the engine default (which changes to overlay if you are in Swarm mode, cool).

ltouro commented 6 years ago

@caduvieira Thanks for the comments. All the content that goes into these named volumes is ephemeral and will be recreated if absent. Only the Diffie–Hellman key takes a couple of minutes to create, but I think that is okay.

cuducos commented 6 years ago

It seems to be working now. Is there any way you'd recommend to test it locally (even without HTTPS), just to check before putting it on the production servers? Maybe changing VIRTUAL_HOST and LETSENCRYPT_HOST to something I could edit in my /etc/hosts?

Otherwise I'll try pointing the DNS to the new servers for a brief test one of these days.

cuducos commented 6 years ago

BTW, may I suggest parametrizing these …_HOST environment variables so we can easily set up staging and production environments? Something along these lines.

ltouro commented 6 years ago

@cuducos /etc/hosts will not work in this case because the process involves external servers that will check the existence of a specific file at domain.tld/.well-known. So you will need the DNS properly configured to test the certificate generation.

ltouro commented 6 years ago

@cuducos Should we provide a default value for LETSENCRYPT_EMAIL? It is used for expiry notices from Let's Encrypt.

      LETSENCRYPT_HOST: ${VIRTUAL_HOST_WEB}
      VIRTUAL_HOST: ${VIRTUAL_HOST_WEB}
      LETSENCRYPT_EMAIL: ${LETSENCRYPT_EMAIL-jarbas@serenatadeamor.org}
cuducos commented 6 years ago

/etc/hosts will not work in this case because the process involves external servers that will check the existence of a specific file at domain.tld/.well-known

It wouldn't work even with HTTP on port 80? Ok…

Should we provide a default value for LETSENCRYPT_EMAIL? It is used for expiry notices from Let's Encrypt.

Yep… it could be op.serenatadeamor@gmail.com ; )

Many thanks!

ltouro commented 6 years ago

@cuducos I see what you mean. You could test over plain HTTP by setting the HTTPS_METHOD environment variable to "noredirect" (I parametrized that too). This way, you can still access the container on port 80. Otherwise, the edge proxy will redirect you to HTTPS automatically.
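
On the web-facing service it is just one more environment entry, something like this (the "redirect" fallback shown is nginx-proxy's own default):

      HTTPS_METHOD: ${HTTPS_METHOD-redirect}   # set HTTPS_METHOD=noredirect to keep plain HTTP reachable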

cuducos commented 6 years ago

Hi @ltouro, would you mind helping me test the PR properly?

Not sure if I am forgetting a step or something…

  1. Added 127.0.0.1 local.jarbas.serenatadeamor.org and 0.0.0.0 local.jarbas.serenatadeamor.org to my /etc/hosts
  2. Added VIRTUAL_HOST_WEB=local.jarbas.serenatadeamor.org and HTTPS_METHOD=noredirect to .env.prod
  3. docker-compose -f docker-compose.yml -f docker-compose.prod.yml up and waited several minutes (I know it takes a while to generate the proper hashes)
  4. Still, port 80 responds with 502 and port 443 is non-responsive:
$ curl -I local.jarbas.serenatadeamor.org:80
HTTP/1.1 502 Bad Gateway
Server: nginx/1.13.6
Date: Mon, 04 Dec 2017 15:44:24 GMT
Content-Type: text/html
Content-Length: 173
Connection: keep-alive

$ curl -I local.jarbas.serenatadeamor.org:443
curl: (52) Empty reply from server

Here are my containers:

$ docker-compose -f docker-compose.yml -f docker-compose.prod.yml ps
WARNING: Some services (proxy) use the 'deploy' key, which will be ignored. Compose does not support 'deploy' configuration - use `docker stack deploy` to deploy to a swarm.
        Name                      Command               State                     Ports
---------------------------------------------------------------------------------------------------------
jarbas_django_1        gunicorn jarbas.wsgi:appli ...   Up       8000/tcp
jarbas_elm_1           npm run assets                   Exit 0
jarbas_memcached_1     docker-entrypoint.sh memcached   Up       11211/tcp
jarbas_nginx_1         nginx -g daemon off;             Up       80/tcp
jarbas_postgres_1      docker-entrypoint.sh postgres    Up       5432/tcp
jarbas_proxy-certs_1   /bin/bash /app/entrypoint. ...   Up
jarbas_queue_1         docker-entrypoint.sh rabbi ...   Up       25672/tcp, 4369/tcp, 5671/tcp, 5672/tcp
jarbas_tasks_1         /bin/sh -c celery worker - ...   Up       8000/tcp
proxy                  /app/docker-entrypoint.sh  ...   Up       0.0.0.0:443->443/tcp, 0.0.0.0:80->80/tcp

And here are the logs:

elm_1 | npm info it worked if it ends with ok
elm_1 | npm info using npm@5.3.0
elm_1 | npm info using node@v8.4.0
postgres_1 | LOG: database system was shut down at 2017-12-04 15:40:24 UTC
proxy-certs_1 | Reloading nginx proxy (proxy)...
postgres_1 | LOG: MultiXact member wraparound protections are now enabled
proxy | forego | starting dockergen.1 on port 5000
elm_1 | npm info lifecycle @~preassets: @
proxy-certs_1 | 2017/12/04 15:42:10 Generated '/app/letsencrypt_service_data' from 4 containers
postgres_1 | LOG: database system is ready to accept connections
elm_1 | npm info lifecycle @~assets: @
proxy-certs_1 | 2017/12/04 15:42:10 Running '/app/update_certs'
proxy | forego | starting nginx.1 on port 5100
postgres_1 | LOG: autovacuum launcher started
elm_1 |
proxy-certs_1 | 2017/12/04 15:42:10 Watching docker events
elm_1 | > @ assets /code
elm_1 | > gulp elm
proxy-certs_1 | Sleep for 3600s
elm_1 |
django_1 | [2017-12-04 13:42:12 -0200] [1] [INFO] Starting gunicorn 19.7.1
proxy-certs_1 | 2017/12/04 15:42:10 Generated '/etc/nginx/conf.d/default.conf' from 4 containers
proxy | dockergen.1 | 2017/12/04 15:42:10 Generated '/etc/nginx/conf.d/default.conf' from 6 containers
django_1 | [2017-12-04 13:42:12 -0200] [1] [INFO] Listening at: http://0.0.0.0:8000 (1)
proxy-certs_1 | 2017/12/04 15:42:10 [notice] 51#51: signal process started
elm_1 | [15:42:11] Using gulpfile /code/gulpfile.js
django_1 | [2017-12-04 13:42:12 -0200] [1] [INFO] Using worker: sync
proxy-certs_1 | 2017/12/04 15:42:11 Contents of /app/letsencrypt_service_data did not change. Skipping notification '/app/update_certs'
proxy-certs_1 | 2017/12/04 15:42:11 Received event start for container 2ed5a7ebcc92
elm_1 | [15:42:11] Starting 'elm'...
django_1 | [2017-12-04 13:42:12 -0200] [7] [INFO] Booting worker with pid: 7
proxy | dockergen.1 | 2017/12/04 15:42:10 Running 'nginx -s reload'
django_1 | [2017-12-04 13:42:12 -0200] [9] [INFO] Booting worker with pid: 9
proxy-certs_1 | 2017/12/04 15:42:12 Received event start for container 90a1dc056e8a
proxy-certs_1 | 2017/12/04 15:42:13 Received event start for container b6f02214eb0f
proxy | dockergen.1 | 2017/12/04 15:42:10 Watching docker events
proxy | dockergen.1 | 2017/12/04 15:42:11 Contents of /etc/nginx/conf.d/default.conf did not change. Skipping notification 'nginx -s reload'
proxy | dockergen.1 | 2017/12/04 15:42:12 Received event start for container 90a1dc056e8a
proxy | dockergen.1 | 2017/12/04 15:42:12 Contents of /etc/nginx/conf.d/default.conf did not change. Skipping notification 'nginx -s reload'
proxy | dockergen.1 | 2017/12/04 15:42:13 Received event start for container b6f02214eb0f
django_1 | [2017-12-04 13:42:13 -0200] [11] [INFO] Booting worker with pid: 11
django_1 | [2017-12-04 13:42:13 -0200] [13] [INFO] Booting worker with pid: 13
proxy | dockergen.1 | 2017/12/04 15:42:13 Generated '/etc/nginx/conf.d/default.conf' from 9 containers
proxy | dockergen.1 | 2017/12/04 15:42:13 Running 'nginx -s reload'
proxy | dockergen.1 | 2017/12/04 15:42:13 Contents of /etc/nginx/conf.d/default.conf did not change. Skipping notification 'nginx -s reload'
tasks_1 | /usr/local/lib/python3.5/site-packages/celery/platforms.py:795: RuntimeWarning: You're running the worker with superuser privileges: this is
tasks_1 | absolutely not recommended!
tasks_1 |
tasks_1 | Please specify a different user using the -u option.
tasks_1 |
tasks_1 | User information: uid=0 euid=0 gid=0 egid=0
tasks_1 |
tasks_1 | uid=uid, euid=euid, gid=gid, egid=egid,
tasks_1 |
tasks_1 | -------------- celery@2ed5a7ebcc92 v4.1.0 (latentcall)
tasks_1 | ---- **** -----
tasks_1 | --- * *** * -- Linux-4.9.49-moby-x86_64-with 2017-12-04 13:42:14
tasks_1 | -- * - **** ---
tasks_1 | - ** ---------- [config]
tasks_1 | - ** ---------- .> app: jarbas:0x7f2cb2f90240
tasks_1 | - ** ---------- .> transport: amqp://guest:**@queue:5672//
tasks_1 | - ** ---------- .> results: disabled://
tasks_1 | - *** --- * --- .> concurrency: 3 (prefork)
tasks_1 | -- ******* ---- .> task events: OFF (enable -E to monitor tasks in this worker)
tasks_1 | --- ***** -----
tasks_1 | -------------- [queues]
tasks_1 | .> celery exchange=celery(direct) key=celery
tasks_1 |
tasks_1 |
tasks_1 | [2017-12-04 13:42:14,575: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@queue:5672//: [Errno 111] Connection refused.
tasks_1 | Trying again in 2.00 seconds...
tasks_1 |
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:14 ===
queue_1 | Starting RabbitMQ 3.6.11 on Erlang 19.3
queue_1 | Copyright (C) 2007-2017 Pivotal Software, Inc.
queue_1 | Licensed under the MPL. See http://www.rabbitmq.com/
queue_1 |
queue_1 | RabbitMQ 3.6.11. Copyright (C) 2007-2017 Pivotal Software, Inc.
queue_1 | ## ## Licensed under the MPL. See http://www.rabbitmq.com/
queue_1 | ## ##
queue_1 | ########## Logs: tty
queue_1 | ###### ## tty
queue_1 | ##########
queue_1 | Starting broker...
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:14 ===
queue_1 | node : rabbit@e31fd7358949
queue_1 | home dir : /var/lib/rabbitmq
queue_1 | config file(s) : /etc/rabbitmq/rabbitmq.config
queue_1 | cookie hash : w37r5jjlyGKvIuzj6jxWtA==
queue_1 | log : tty
queue_1 | sasl log : tty
queue_1 | database dir : /var/lib/rabbitmq/mnesia/rabbit@e31fd7358949
tasks_1 | [2017-12-04 13:42:16,599: ERROR/MainProcess] consumer: Cannot connect to amqp://guest:**@queue:5672//: [Errno 111] Connection refused.
tasks_1 | Trying again in 4.00 seconds...
tasks_1 |
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | Memory high watermark set to 1979 MiB (2075975680 bytes) of 4949 MiB (5189939200 bytes) total
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | Enabling free disk space monitoring
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | Disk free limit set to 50MB
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | Limiting to approx 1048476 file handles (943626 sockets)
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | FHC read buffering: OFF
queue_1 | FHC write buffering: ON
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | Waiting for Mnesia tables for 30000 ms, 9 retries left
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | Waiting for Mnesia tables for 30000 ms, 9 retries left
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | Priority queues enabled, real BQ is rabbit_variable_queue
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | Starting rabbit_node_monitor
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | msg_store_transient: using rabbit_msg_store_ets_index to provide index
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | msg_store_persistent: using rabbit_msg_store_ets_index to provide index
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | started TCP Listener on [::]:5672
queue_1 | completed with 0 plugins.
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:18 ===
queue_1 | Server startup complete; 0 plugins started.
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:20 ===
queue_1 | accepting AMQP connection <0.290.0> (172.21.0.6:46830 -> 172.21.0.3:5672)
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:21 ===
queue_1 | connection <0.290.0> (172.21.0.6:46830 -> 172.21.0.3:5672): user 'guest' authenticated and granted access to vhost '/'
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:21 ===
queue_1 | accepting AMQP connection <0.298.0> (172.21.0.6:46832 -> 172.21.0.3:5672)
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:21 ===
queue_1 | connection <0.298.0> (172.21.0.6:46832 -> 172.21.0.3:5672): user 'guest' authenticated and granted access to vhost '/'
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:21 ===
queue_1 | accepting AMQP connection <0.317.0> (172.21.0.6:46834 -> 172.21.0.3:5672)
queue_1 |
queue_1 | =INFO REPORT==== 4-Dec-2017::15:42:21 ===
queue_1 | connection <0.317.0> (172.21.0.6:46834 -> 172.21.0.3:5672): user 'guest' authenticated and granted access to vhost '/'
proxy-certs_1 | 2017/12/04 15:42:28 Debounce minTimer fired
proxy-certs_1 | 2017/12/04 15:42:28 Generated '/app/letsencrypt_service_data' from 7 containers
proxy-certs_1 | 2017/12/04 15:42:28 Running '/app/update_certs'
proxy-certs_1 | Creating/renewal local.jarbas.serenatadeamor.org certificates... (local.jarbas.serenatadeamor.org)
proxy-certs_1 | 2017-12-04 15:42:29,352:INFO:simp_le:1213: Generating new account key
elm_1 | [15:42:35] Finished 'elm' after 24 s
elm_1 | npm info lifecycle @~postassets: @
elm_1 | npm info ok
proxy | dockergen.1 | 2017/12/04 15:42:35 Received event die for container e8d50b151a8a
proxy-certs_1 | 2017/12/04 15:42:35 Received event die for container e8d50b151a8a
jarbas_elm_1 exited with code 0
proxy | dockergen.1 | 2017/12/04 15:42:36 Contents of /etc/nginx/conf.d/default.conf did not change. Skipping notification 'nginx -s reload'
proxy-certs_1 | TOS hash mismatch. Found: cc88d8d9517f490191401e7b54e9ffd12a2b9082ec7a1d4cec6101f9f1647e7b.
proxy-certs_1 |
proxy-certs_1 | Debugging tips: -v improves output verbosity. Help is available under --help.
proxy-certs_1 | Sleep for 3600s
proxy-certs_1 | 2017/12/04 15:42:50 Debounce minTimer fired
proxy-certs_1 | 2017/12/04 15:42:50 Contents of /app/letsencrypt_service_data did not change. Skipping notification '/app/update_certs'
proxy | nginx.1 | local.jarbas.serenatadeamor.org 172.20.0.1 - - [04/Dec/2017:15:44:24 +0000] "HEAD / HTTP/1.1" 502 0 "-" "curl/7.54.0"
nginx_1 | 2017/12/04 15:44:24 [error] 5#5: *1 django could not be resolved (3: Host not found), client: 172.20.0.3, server: localhost, request: "HEAD / HTTP/1.1", host: "local.jarbas.serenatadeamor.org"
nginx_1 | 172.20.0.3 - - [04/Dec/2017:15:44:24 +0000] "HEAD / HTTP/1.1" 502 0 "-" "curl/7.54.0"

I see that the nginx container logged django could not be resolved (3: Host not found), so proxying to django (the container name) might not be working… Any idea what I might be missing?

ltouro commented 6 years ago

@cuducos I think the error is related to the django container not being included in the newly created backend network. All HTTP-exposed containers should be added to this network.

We need something like this (docker-compose.prod.yml:32~44):

django:
    env_file:
      - .env
    environment:
      - DEBUG=False
    depends_on:
      - memcached
    expose:
      - "8000"
    volumes:
      - assets:/code/staticfiles
    entrypoint: ["gunicorn", "jarbas.wsgi:application", "--reload", "--bind", "0.0.0.0:8000", "--workers", "4"]
    networks:
      - backend
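
Plus the top-level declaration for that network, if it is not there already (sketch):

networks:
  backend: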

Should I update this PR, or do we follow up on #294?

cuducos commented 6 years ago

@cuducos I think the error is related to not including the django container in the newly created network backend. All HTTP-exposed containers should be added to this network.

Many thanks, I haven't had the chance to test it yet, but I'll test it soon!

Should I update this PR, or do we follow up on #294?

Baby steps. First let's make sure this PR is merged, then we look at #294 ; ) Would you mind updating it here?

cuducos commented 6 years ago

Yay! Now it works : ) Many many thanks for all the support @ltouro! 🎉

I made some minor additions to your branch. Would you mind cherry-picking these changes into your PR? This is the branch with my suggestions. Namely:

  1. Document new production variables (f8b6f32e1dec6ece93d5f3e37fd5730cf27ba793)
  2. Auto-set Django ALLOWED_HOSTS with the host name you added as an envvar (72c550e3c2e82485ce9173b43708b8a1e6e1b96a)
  3. I opted to keep .env with local/dev values and add a .env.prod for production (bd0d85ee81e7fb95c8b44ebe765872991b9ba72b)

What do you think?


In other news: do you use Twitter? I'll probably tweet a huge thank you, this contribution is very, very important ; )

ltouro commented 6 years ago

Picked up your suggestions. It's the first time I've cherry-picked, hope I got it right.

I don't use Twitter, but feel congratulated already! Thanks :dog:

cuducos commented 6 years ago

Fix #291

cuducos commented 6 years ago

I don't use Twitter, but feel congratulated already! Thanks 🐶

https://twitter.com/cuducos/status/938096112409415682