apache / couchdb

Seamless multi-master syncing database with an intuitive HTTP/JSON API, designed for reliability
https://couchdb.apache.org/
Apache License 2.0

Replication Problem when using SSL with Nginx frontend and native SSL CouchDB #1475

Closed empeje closed 6 years ago

empeje commented 6 years ago

Long story short: replication over SSL is broken for us, whether we use CouchDB's native SSL or an Nginx SSL frontend.

Expected Behavior

Current Behavior

Possible Solution

Steps to Reproduce (for bugs)

  1. Install CouchDB on a server (instance 1)

Here is my setup

vi.yml

services:
  couchdb:
    image: treehouses/couchdb:2.1.2
    restart: always
    volumes:
      - "/srv/vi/conf:/opt/couchdb/etc/local.d"
      - "/srv/vi/data:/opt/couchdb/data"
      - "/srv/vi/log:/opt/couchdb/var/log"
  db-init:
    image: treehouses/planet:db-init-0.2.15
    depends_on:
      - couchdb
    environment:
      - COUCHDB_HOST=http://couchdb:5984
      - COUCHDB_USER=somehinguser
      - COUCHDB_PASS=somethingpass
  planet:
    image: treehouses/planet:0.2.15
    restart: always
    environment:
      - HOST_PROTOCOL=https
      - DB_HOST=vi.media.mit.edu
      - DB_PORT=2200
      - CENTER_ADDRESS=earth.ole.org:2200
    volumes:
      - "/var/run/docker.sock:/var/run/docker.sock"
    depends_on:
      - couchdb
  proxy:
    image: nginx:1.13
    restart: always
    volumes:
      - "/srv/vi/cert/vi.media.mit.edu/:/etc/nginx/certs/vi.media.mit.edu/"
      - "/srv/vi/proxy/app.conf:/etc/nginx/conf.d/app.conf"
    depends_on:
     - planet
    ports:
     - "80:80"
     - "443:443"
     - "2200:2200"
version: "2"

app.conf

server_tokens off;

server {
  listen 80;
  listen [::]:80;

  server_name vi.media.mit.edu;

  location / {
    return 301 https://$host$request_uri;
  }
}

server {
  listen 443 ssl deferred http2 default_server;
  listen [::]:443 ssl deferred http2 default_server;

  ssl_certificate certs/vi.media.mit.edu/fullchain.cer;
  ssl_certificate_key certs/vi.media.mit.edu/vi.media.mit.edu.key;
  ssl_trusted_certificate certs/vi.media.mit.edu/fullchain.cer;

  ssl_session_cache shared:SSL:50m;
  ssl_session_timeout 1d;
  ssl_session_tickets off;

  ssl_prefer_server_ciphers on;
  ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
  ssl_ciphers 'ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:DHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-AES128-SHA256:ECDHE-RSA-AES128-SHA256:ECDHE-ECDSA-AES128-SHA:ECDHE-RSA-AES256-SHA384:ECDHE-RSA-AES128-SHA:ECDHE-ECDSA-AES256-SHA384:ECDHE-ECDSA-AES256-SHA:ECDHE-RSA-AES256-SHA:DHE-RSA-AES128-SHA256:DHE-RSA-AES128-SHA:DHE-RSA-AES256-SHA256:DHE-RSA-AES256-SHA:ECDHE-ECDSA-DES-CBC3-SHA:ECDHE-RSA-DES-CBC3-SHA:EDH-RSA-DES-CBC3-SHA:AES128-GCM-SHA256:AES256-GCM-SHA384:AES128-SHA256:AES256-SHA256:AES128-SHA:AES256-SHA:DES-CBC3-SHA:!DSS';
  resolver 8.8.8.8 8.8.4.4;
  ssl_stapling on;
  ssl_stapling_verify on;
  server_name vi.media.mit.edu;

  add_header Strict-Transport-Security "max-age=31536000; includeSubdomains; preload";
  # add_header X-Frame-Options SAMEORIGIN always;
  add_header X-Content-Type-Options nosniff always;
  add_header X-XSS-Protection "1; mode=block" always;

  proxy_hide_header X-Powered-By;
  server_tokens off;

  location / {
    rewrite /(.*) /$1  break;
    proxy_pass http://planet:80;
    proxy_redirect     off;
    proxy_set_header   Host $host;
  }

}

# HTTPS server
server {
    ssl_certificate certs/vi.media.mit.edu/fullchain.cer;
    ssl_certificate_key certs/vi.media.mit.edu/vi.media.mit.edu.key;
    ssl_trusted_certificate certs/vi.media.mit.edu/fullchain.cer;

    ssl_session_timeout 5m;
    ssl_ciphers "EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH";
    ssl_protocols TLSv1 TLSv1.1 TLSv1.2;
    ssl_prefer_server_ciphers on;
    add_header Strict-Transport-Security "max-age=63072000; includeSubdomains; preload";
    # add_header X-Frame-Options DENY;
    add_header X-Content-Type-Options nosniff;
    add_header X-Clacks-Overhead "GNU Terry Pratchett";
    ssl_stapling on;
    ssl_stapling_verify on;
    resolver 8.8.4.4 8.8.8.8 valid=300s;
    resolver_timeout 5s;       

    listen 2200 ssl;
    listen [::]:2200 ssl;
    server_name vi.media.mit.edu;
    client_max_body_size 1024M;
    proxy_hide_header X-Powered-By;

    location / {
            proxy_pass http://couchdb:5984;
    }

    }
  2. Create the same instance on another server (instance 2)
  3. Create an instance on a Raspberry Pi, for example with planet.yml (instance 3)
  4. Upload large files to instance 1 so that instances 2 and 3 have something to replicate; try a big one. In our case it was 2.9 GB.
  5. Try to replicate from instance 1 to instance 3, and from instance 1 to instance 2. (A minimal sketch of such a replication request follows this list.)
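One way such a replication can be triggered is via the _replicate endpoint; the sketch below is only illustrative, with the host names, database name, and credentials taken from or modeled on the placeholders in the compose file above, not our real deployment values.

# Minimal sketch: trigger a one-shot pull replication on instance 3
# (the Raspberry Pi's local CouchDB) from instance 1 behind Nginx.
# Host names, database name, and credentials are placeholders.
import requests

PI = "http://localhost:5984"                          # instance 3, local CouchDB
SOURCE = "https://vi.media.mit.edu:2200/resources"    # instance 1 behind Nginx
AUTH = ("somehinguser", "somethingpass")

resp = requests.post(
    f"{PI}/_replicate",
    json={"source": SOURCE, "target": f"{PI}/resources", "create_target": True},
    auth=AUTH,
    timeout=600,
)
resp.raise_for_status()
print(resp.json())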

Additional Data

Experiment 1

Experiment 2

[log]
writer = file
file = /opt/couchdb/var/log/couch.log

[chttpd]
bind_address = any

[httpd]
httpd={}
;bind_address = any
;enable_cors = true
;max_http_request_size = 4294967296
max_http_request_size = 1073741824

[replicator]
socket_options = [{keepalive, true}, {nodelay, false}]
checkpoint_interval = 300000
use_checkpoints = true

[cors]
credentials = true
methods = GET, PUT, POST, HEAD, DELETE
origins = *
headers = accept, authorization, content-type, origin, referer, x-csrf-token

[couchdb]
uuid = auuid
max_document_size = 4294967296

[couch_httpd_auth]
secret = anaouth
timeout = 1200
public_fields = name,firstName,middleName,lastName,roles,isUserAdmin,joinDate,email,phoneNumber,gender
users_db_public = true

[satellite]
pin = apin

[admins]

[ssl]
key_file = /etc/nginx/certs/chimera.media.mit.edu/chimera.media.mit.edu.key
cert_file = /etc/nginx/certs/chimera.media.mit.edu/fullchain.cer
cacert_file = /etc/nginx/certs/chimera.media.mit.edu/ca.cer
bind_address = any
port = 6984

[daemons]
httpsd = {chttpd, start_link, [https]}

Result

The left pattern is Experiment 1, and the right is Experiment 2.

Disk usage

(screenshot: disk usage)

Network traffic

(screenshot: network traffic)

In Experiment 1, the instances try to sync/replicate the data, and after about 8 minutes the connection resets; you can see the pattern in the disk usage and network traffic above. In Experiment 2, the connection between the CouchDB instances stays up, but every 8 minutes some data is deleted (as seen in the disk usage).

Context

I'm working on a project called planet, where we are building a federated learning management system, similar to what Mastodon did but for learning management. Our setup is a central server (earth) and nation servers deployed in Docker with docker-compose (our application is not that complex, and no k8s is necessary, at least for now), plus Raspberry Pis deployed in the field in places where the internet connection is still unstable (small communities in Madagascar, Ghana, Nepal, etc.). On our central server we use Let's Encrypt for SSL encryption and Nginx as the reverse proxy. The problem appears when we try to replicate a CouchDB database from the Raspberry Pi, earth, or nation server to earth or nation.

At first we thought the Raspberry Pi was the problem, but it turned out the behavior is reproducible with CouchDB deployed on a server too. For quick notes, we are using version 2.1.2 (you can check it here). Previously, with CouchDB 1.6, we never had this kind of problem, and that was on an RPi as well.

Your Environment

nickva commented 6 years ago

Hi @empeje

Thank you for your report. A few questions and comments:

empeje commented 6 years ago

Thanks for your answer @nickva.

So many things to address here.

Regarding the 3Gb files, are those sizes for single document bodies? Are those attachments? Individual or total? Usually storing large individual documents or attachments in that range is considered an anti-pattern for CouchDB. Is there any way to break those up into smaller documents?

They are mostly attachments. We are distributing learning-resource data via CouchDB. I know it is an anti-pattern, but we take this risk because it is one of the simplest ways to create a federated system with the limited human resources of a social-benefit organization like ours. Our goal is to at least be able to replicate 1 GB of attachments.
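If we do end up breaking resources apart, one possible approach is to split each file into fixed-size chunks and attach each chunk to its own document, so no single request exceeds the configured limits. This is only a sketch; the chunk size, database URL, credentials, and document naming are illustrative, reusing the placeholders from the compose file above.

# Sketch only: split a large file into ~50 MB pieces and store each piece
# as an attachment on its own document. URL, credentials, and ids are
# placeholders, not real deployment values.
import requests

DB = "https://vi.media.mit.edu:2200/resources"
AUTH = ("somehinguser", "somethingpass")
CHUNK = 50 * 1024 * 1024  # 50 MB per piece

def upload_in_chunks(path, resource_id):
    with open(path, "rb") as f:
        index = 0
        while True:
            data = f.read(CHUNK)
            if not data:
                break
            doc_id = f"{resource_id}_chunk_{index:04d}"
            # Create the chunk document first to obtain a _rev.
            r = requests.put(f"{DB}/{doc_id}",
                             json={"parent": resource_id, "index": index},
                             auth=AUTH)
            r.raise_for_status()
            rev = r.json()["rev"]
            # Attach the chunk bytes to that document.
            requests.put(f"{DB}/{doc_id}/data",
                         params={"rev": rev},
                         data=data,
                         headers={"Content-Type": "application/octet-stream"},
                         auth=AUTH).raise_for_status()
            index += 1

upload_in_chunks("big-video.mp4", "resource-123")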

Is it a single replication or are there others (many) running at the same time?

Our server may receive multiple replication requests at a time.

Noticed that max http request size and max client body size are set to 1Gb and CouchDB's max document size is set to 4Gb. Perhaps adjust the request size in both Nginx and CouchDb to a large value. For max request size, you'd want it greater than the sum of all the document's revisions and the total size of its attachments.

Thank you for catching this, but we also have the problem when replicating data under 1 GB. The 2.7 GB I mentioned is basically a complete database consisting of several records with their attachments.
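As a rough way to sanity-check those limits, the attachment stubs returned by a plain GET of a document can be summed to see how much data a single document carries. This is only a sketch; the database URL, credentials, and document id are placeholders, and it ignores the size of older revisions.

# Sketch: sum a document's attachment stub lengths to get a rough lower
# bound on how large max_http_request_size needs to be for that document.
# URL, credentials, and document id below are placeholders.
import requests

DB = "https://vi.media.mit.edu:2200/resources"
AUTH = ("somehinguser", "somethingpass")

doc = requests.get(f"{DB}/some-resource-doc", auth=AUTH).json()
total = sum(att["length"] for att in doc.get("_attachments", {}).values())
print(f"attachment bytes: {total} (~{total / 1024 ** 2:.1f} MB); "
      "max_http_request_size should comfortably exceed this")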

Regarding Experiment 1, with 2.1.2 release, should apply some of those max size configs, as opposed to using the defaults. In that release max request size was 64Mb so the default values won't work, especially if you have individual doc revisions + attachment sizes that exceed that.

That's why we moved on to Experiment 2; we still had the problem after increasing the max request size.

Do you know if smaller sizes work, what are the limits that work (1Gb, 100Mb,...)?

We have definitely had success replicating a 121 MB record, of which about 120 MB was attachments.

Perhaps try replicating on a local network without SSL or Nginx as a debugging experiment...

We are in the process of debugging this and will update with the results later.

Inspect the logs on source and target. Do you see 413 http errors, timeouts? Especially see if you can notice the part when errors start happening. A 413 error, either sent by CouchDB or Nginx, might indicate that some of the max size limits might be applied.

We haven't noticed any 413 errors, but this is good to know so I can debug further.
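A quick way to look for them, given the log file path from the [log] section above, is simply to scan the CouchDB log for lines mentioning a 413 status (a sketch only; the Nginx access log would need the same check).

# Sketch: scan the CouchDB log (path taken from the [log] section above)
# for lines mentioning a 413 status, to see whether a size limit is hit.
LOG = "/opt/couchdb/var/log/couch.log"

with open(LOG, errors="replace") as f:
    hits = [line.rstrip() for line in f if " 413 " in line]

print(f"{len(hits)} log lines mention a 413 status")
for line in hits[-10:]:
    print(line)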

use_checkpoints = true is the default for the replicator, no need to set it explicitly.

Thanks, good to know. We're a little bit desperate after these results, so we are trying various things in the hope that some of them work.

There is also http://docs.couchdb.org/en/stable/config/replicator.html#replicator/connection_timeout setting, maybe adjusting that up a bit might help.

Thanks

Checkpoints are usually not that expensive, just an update of a local document on target and source. Try not to delay it too much. I would reduce that to something closer to the default.

I tried, but I wonder whether checkpointing also works with attachments?


Thanks for the support. We are currently experimenting with a plain no-SSL setup locally and with HAProxy, and we will also try some of your suggestions. I hope some of them work, and I will share the results here.

nickva commented 6 years ago

Also, the new release (2.2), which is being finalized, has some replication improvements relating to attachment uploads.

http://docs.couchdb.org/en/2.2.0/whatsnew/2.2.html#version-2-2-0

You could wait till the release is out or try building it during your testing, to see if improvements help.

wohali commented 6 years ago

@empeje FYI, we do have people replicating behind Nginx, or with native SSL, and finding it works OK for them. I suspect all of your problems are specifically related to attachment usage.

Have you had a chance to try the 2.2.0 RCs at all?

empeje commented 6 years ago

@wohali @nickva we just ran a small experiment a few days ago with 2.1.2 and this config

[log]
writer = file
file = /opt/couchdb/var/log/couch.log

[chttpd]
bind_address = any

[httpd]
bind_address = any
enable_cors = true
max_http_request_size = 4294967296

[couchdb]
max_document_size = 4294967296
uuid = anuuid

[replicator]
socket_options = [{keepalive, true}, {nodelay, false}]
checkpoint_interval = 5000
use_checkpoints = true

[cors]
origins = *
credentials = true
methods = GET, PUT, POST, HEAD, DELETE
headers = accept, authorization, content-type, origin, referer, x-csrf-token

[couch_httpd_auth]
timeout = 1200
users_db_public = true
public_fields = name,firstName,middleName,lastName,roles,isUserAdmin,joinDate,email,phoneNumber,gender
secret = asecret

And we were able to replicate 1.5 GB of attachments. I suspect there is some problem with my SSL setup.

Anyway, I'll try the 2.2.0 RCs also. Thanks @wohali

wohali commented 6 years ago

Closing as it sounds like it's not a CouchDB issue specifically - if you come up with something we can help with explicitly, let us know and we can re-open it.

empeje commented 6 years ago

Thanks @wohali