webplatform / ops

http://webplatform.github.io/ops/

Harmonize internal backends and how to expose services to the public #115

Closed · renoirb closed this issue 9 years ago

renoirb commented 9 years ago

We have reached a point where the way we expose web applications creates a set of issues that need to be addressed.

Most web applications are configured so that both the web server and the backend code must run on the same virtual machine, which prevents us from scaling our capacity.

Also...

Context

Before this project, it was impossible to implement "horizontal scaling" by adding frontend and/or internal upstream nodes.

Requirements

A one-to-one convention: for each public name (e.g. notes.webplatform.org), we proxy to an internal upstream service on the private network, which exposes an HTTP server on a pre-determined port specific to each web application.

With this, we can limit the number of public IP addresses while still being able to scale by adding internal upstream nodes, without being impacted by the number of web apps we run.
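
As a minimal sketch of that convention, reusing the notes/Hypothesis node IP and port that appear later in this thread (everything else here is illustrative):

# Sketch only; values borrowed from the examples further down.
server {
    listen      443 ssl;
    server_name notes.webplatform.org;   # public name

    # SSL configs...

    location / {
        # one internal upstream node for the "notes" app, on its assigned port
        proxy_pass http://10.10.10.157:8005;
    }
}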

upstream

  1. Run the backend runtime service (e.g. Hypothesis’ gunicorn, php5-fpm FastCGI, etc.), listening on a unix socket or the loopback interface (e.g. 127.0.0.1)
    1. Monit to ensure the backend runtime is up and alive
  2. Minimal NGINX virtual host (called "upstream") listening on the private network, serving on a port defined per web application
    1. Monit to ensure NGINX is up and alive
    2. Expose service status information for both NGINX and the runtime (when applicable) to the private network
    3. Handle aliases, request rewrites, and optionally other optimizations
    4. Serve static assets so that the public frontend can proxy them without having to install the web app just to serve static assets
  3. Internal DNS to list upstreams (a sketch follows this list)
    1. Salt to update internal DNS, CRUD-ing A records so DNS always knows which private IPs can handle a given web application backend; alternatively use it only as a query mechanism to generate config files, or not use it at all. TBD
    2. Assign an internal name per backend (e.g. notes.backend.production.wpdn)
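
If the internal DNS route were taken, NGINX can re-resolve a backend name at request time by combining a resolver with a variable in proxy_pass. A sketch, where the resolver address is hypothetical and the name comes from item 3.2:

# Hypothetical: 10.10.10.2 stands in for the internal DNS server.
resolver 10.10.10.2 valid=30s;

location / {
    # Using a variable forces NGINX to resolve the name per request,
    # so added/removed A records for the backend are picked up.
    set        $notes_backend notes.backend.production.wpdn;
    proxy_pass http://$notes_backend:8005;
}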

public

  4. Listen on the public IP address and proxy requests to the internal upstream web server; e.g. notes.webplatform.org proxies to the notes.backend.production.wpdn internal upstream nodes, which serve on port 8005, the port assigned to our Hypothesis instance
  5. Redirect non-SSL requests to their SSL equivalent
  6. Abstract upstream server load balancing by calling an internal DNS name that gives out the list of available servers; see the NGINX upstream feature
  7. Handle response filtering prior to serving
  8. In case an internal upstream server is broken, serve a human-friendly error page (see the sketch after this list)
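
A sketch of item 8, assuming the friendly error page is a static file shipped on the frontend itself (the /50x.html name and location are assumptions):

location / {
    proxy_pass             http://upstream_hypothesis;
    include                proxy_params;
    # Let NGINX take over when the upstream answers with an error status.
    proxy_intercept_errors on;
    error_page             502 503 504 /50x.html;
}

location = /50x.html {
    root     /var/www/html;   # served locally, no upstream involved
    internal;
}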

Nice to have

  9. Have web apps that can only run under Apache and mpm-prefork (i.e. MediaWiki) proxied the same way as any other backend
  10. Handle asset caching, e.g. keep a local copy of the backend response so that no web application has to be installed locally only for static assets (see the caching sketch after the snippet below)
  11. Gradually switch over all Apache2 configuration
  12. In the case of a static site, strip cookies, Pragma, and similar headers:

    # requires HttpHeadersMore, which is part of nginx-extras
    # ref: http://wiki.nginx.org/NginxHttpHeadersMoreModule#more_clear_headers
    expires    2m;
    add_header Cache-Control public;
    more_clear_headers Set-Cookie Cache-Control Pragma;
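
For item 10, NGINX's built-in proxy cache could keep those local copies on the frontend. A sketch, where the cache path and zone name are assumptions:

# In the http context:
proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=assets:10m max_size=1g;

# In the server block for the public name:
location /assets/ {
    proxy_pass        http://upstream_hypothesis;
    include           proxy_params;
    # Serve repeated asset requests from the local cache instead of
    # hitting the internal upstream every time.
    proxy_cache       assets;
    proxy_cache_valid 200 10m;
}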

Life of an HTTP request

This illustrates where an HTTP request goes, and what it passes through, in order to serve a response to the client.

nginx frontend

While Fastly encapsulates caching and load balancing, any application that needs to be served directly by us cannot scale unless we rework the configuration accordingly.


VirtualHosts

Each endpoint will have both an internal and a public virtual host configuration.

Priorities

renoirb commented 9 years ago

Don’t forget to update notes in https://docs.webplatform.org/wiki/WPD:Infrastructure/architecture/Things_to_consider_when_we_expose_service_via_Fastly_and_Varnish

renoirb commented 9 years ago

Ensure tracking.webplatform.org redirects to https://stats.webplatform.org/ (testthewebforward.org uses it)
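
A sketch of that redirect in the same style as the frontend virtual hosts below (whether to preserve $request_uri is an assumption):

server {
    listen      80;
    server_name tracking.webplatform.org;
    # Permanent redirect so existing testthewebforward.org links keep working.
    return      301 https://stats.webplatform.org$request_uri;
}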

renoirb commented 9 years ago

Refactor progress:

renoirb commented 9 years ago

Proposed architecture

Each upstream service (i.e. the basic web server of a web application, as a common denominator) has a set of nodes and is assigned a port number. A web application can run from a VM (as it originally did), but this scheme now also supports another VM running a Docker container, or anything else that exposes an HTTP service.

Having a dedicated port ensures that we can separate which service runs on an internal IP address without relying on the Host: header or changing any HTTP header, and it also removes the need for a DNS server. By doing so, it avoids possible outages due to a reboot or outdated records, and it also speeds up serving the request by eliminating a DNS query.

A simple map stating which IP answers for the desired web application is then used to generate the configuration of the public frontend servers.

# Upstream pillar
upstream:
  notes:
    port: 8005
    nodes:
      - 10.10.10.157
# ...
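
A sketch of how the frontend's upstream block could be generated from that pillar through a Salt-managed Jinja template (the template itself is an assumption; salt['pillar.get'] is the standard pillar lookup). Note that it keys the block on the pillar name (upstream_notes), whereas the hand-written example below says upstream_hypothesis, the same naming inconsistency discussed later in this thread:

# upstreams.conf template (Jinja), rendered by Salt into NGINX config
{% for name, conf in salt['pillar.get']('upstream', {}).items() %}
upstream upstream_{{ name }} {
{%- for node in conf.nodes %}
    server    {{ node }}:{{ conf.port }};
{%- endfor %}
}
{% endfor %}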

A web application then has two virtual hosts: one for the internal network ("upstream") and one for the public ("frontend") server.

Frontend virtual host

# Generated automatically from the Upstream pillar
upstream upstream_hypothesis {
    server    10.10.10.157:8005;
    server    10.10.10.151:8005;
    server    10.10.10.17:8005;
}

server {
    listen      80;
    server_name notes.webplatform.org;
    include     common_params;
    return      301 https://notes.webplatform.org$request_uri;
}

server {
    listen      443 ssl spdy;
    server_name notes.webplatform.org;

    root    /var/www/html;
    include common_params;
    include ssl_params;

    # SSL configs...

    location / {
        proxy_pass http://upstream_hypothesis;
        include proxy_params;
        proxy_intercept_errors on;

        # WebSocket support plz
        proxy_http_version 1.1;
        proxy_set_header   Upgrade    $http_upgrade;
        proxy_set_header   Connection "upgrade";
    }
}
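
One possible refinement to the WebSocket lines above (not part of the original config): the commonly used map block, so the Connection header is only forced to "upgrade" when the client actually asked for one:

# In the http context, before the server blocks:
map $http_upgrade $connection_upgrade {
    default upgrade;
    ''      close;
}

# Then, inside the location block:
#     proxy_set_header Connection $connection_upgrade;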

Upstream virtual host

server {
    listen  8005;

    root    /srv/webplatform/notes-server/notes_server/static;
    include common_params;

    rewrite ^/app/embed.js$ /annotator.js permanent;

    location = /annotator.js {
      rewrite    ^/annotator.js$  /embed.js last;
    }

    rewrite      ^/assets/notes-server/(.*)\? /$1;

    location / {
        proxy_pass   http://127.0.0.1:8001;
        include          proxy_params;

        # Since we are not using SSL internally, we have to force it here.
        # ref: https://docs.djangoproject.com/en/dev/ref/settings/#secure-proxy-ssl-header
        proxy_set_header X-Forwarded-Proto https;

        # These seem to help with Gunicorn "Broken pipe" socket errors.
        # The assumption is that buffering is the cause, because two NGINX servers
        # handle requests to Gunicorn: the internal one (this one) and the public frontend.
        # Quoting the Gunicorn docs: "If you want to be able to handle streaming request/responses or
        # other fancy features like Web sockets, you need to turn off the proxy buffering".
        # ref: http://gunicorn-docs.readthedocs.org/en/latest/deploy.html
        proxy_buffering off;

        # If we were to use a different backend hostname, we’d have to force it
        # like this.
        #proxy_set_header Host notes.webplatform.org;
    }
}

renoirb commented 9 years ago

Currently working on making names consistent. In the Salt states so far, the concepts have changed names many times between local, backend, and upstream, and likewise for how we refer to a site (e.g. notes.webplatform.org) versus the web application software running it (e.g. Hypothesis). Sometimes a virtual host runs more than one web application, which creates confusion in the naming. This has to be handled too.

WebPlatformDocs commented 9 years ago

Everything went well, ready to merge.

renoirb commented 9 years ago

Note to self.

Forget about using an internal name (i.e. referring to notes.webplatform.org as notes.backend.production.wpdn); let’s save on communication round trips and configure NGINX to know exactly which IP addresses can serve a web app. The NGINX upstream configuration will be generated as required by SaltStack.

renoirb commented 9 years ago

Sent in "Work status update" email on devrel list