renoirb closed this issue 9 years ago
Ensure tracking.webplatform.org redirects to https://stats.webplatform.org/ (testthewebforward.org uses it)
Refactor progress:
Each upstream service (i.e. the basic web server of a web application, as a common denominator) has a set of nodes and is assigned a port number. A web application can run from a VM (like it originally was), but this now also supports another VM running a Docker container, or anything else that exposes an HTTP service.
Assigning a port ensures that we can tell apart which service runs on an internal IP address without relying on the Host: ... header or rewriting HTTP headers, and it also removes the need for an internal DNS server. By doing so, it avoids possible outages due to a reboot or outdated records, and also speeds up the time to render the request by eliminating a DNS query.
A simple map stating which IP answers for the desired web application is then used to generate the configuration of the public frontend servers.
# Upstream pillar
upstream:
  notes:
    port: 8005
    nodes:
      - 10.10.10.157
      # ...
A web application then has two virtual hosts: one for the internal network ("upstream"), one for the public ("frontend") server.
# Generated automatically from the Upstream pillar
upstream upstream_hypothesis {
    server 10.10.10.157:8005;
    server 10.10.10.151:8005;
    server 10.10.10.17:8005;
}

server {
    listen 80;
    server_name notes.webplatform.org;
    include common_params;
    return 301 https://notes.webplatform.org$request_uri;
}

server {
    listen 443 ssl spdy;
    server_name notes.webplatform.org;
    root /var/www/html;

    include common_params;
    include ssl_params;
    # SSL configs...

    location / {
        proxy_pass http://upstream_hypothesis;
        include proxy_params;
        proxy_intercept_errors on;

        # WebSocket support plz
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
server {
    listen 8005;
    root /srv/webplatform/notes-server/notes_server/static;

    include common_params;

    rewrite ^/app/embed.js$ /annotator.js permanent;

    location = /annotator.js {
        rewrite ^/annotator.js$ /embed.js last;
    }

    rewrite ^/assets/notes-server/(.*)\? /$1;

    location / {
        proxy_pass http://127.0.0.1:8001;
        include proxy_params;

        # Since we are not using SSL internally, we have to force it here.
        # ref: https://docs.djangoproject.com/en/dev/ref/settings/#secure-proxy-ssl-header
        proxy_set_header X-Forwarded-Proto https;

        # These seem to help with GUnicorn "Broken pipe" socket errors.
        # The assumption is that buffering causes trouble because two NGINX servers
        # handle requests to GUnicorn: the internal one (this one) and the public frontend.
        # Quoting GUnicorn docs: "If you want to be able to handle streaming request/responses or
        # other fancy features like Web sockets, you need to turn off the proxy buffering".
        # ref: http://gunicorn-docs.readthedocs.org/en/latest/deploy.html
        proxy_buffering off;

        # If we were to use a different backend hostname, we’d have to force it
        # like this.
        #proxy_set_header Host notes.webplatform.org;
    }
}
Currently working on making names consistent. In the states so far, the concepts changed name many times between local, backend and upstream, and so did the way we refer to a site (e.g. notes.webplatform.org) versus the web application software running it (e.g. hypothesis). Sometimes a virtual host runs more than one web application, which creates confusion in the naming. This has to be handled too.
Everything went well, ready to merge.
Note to self:
Forget about using an internal name (i.e. notes.webplatform.org being referred to as notes.backend.production.wpdn); let’s save on communication round trips and configure NGINX to know exactly which IP addresses can serve a web app. The NGINX upstream configuration will be generated as required by Salt Stack.
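As an illustration of that last point, here is a minimal sketch, assuming a Salt-managed Jinja template (the file name upstream_conf.jinja and deriving the upstream name from the pillar key are assumptions, not the actual state in our repository), of how the Upstream pillar above could be rendered into the NGINX upstream blocks:

# upstream_conf.jinja (hypothetical file name): render one "upstream" block
# per entry of the Upstream pillar shown earlier.
{%- for app, conf in salt['pillar.get']('upstream', {}).items() %}
upstream upstream_{{ app }} {
{%- for node in conf['nodes'] %}
    server {{ node }}:{{ conf['port'] }};
{%- endfor %}
}
{%- endfor %}

Salt’s file.managed with template: jinja would then write the rendered result into the frontend NGINX configuration.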
Sent in "Work status update" email on devrel list
We reached a point in the way we expose web applications where it creates a set of issues that needs to be addressed.
Most web applications are configured in a way that they have to run both the web server and the backend code from the same virtual machine, preventing us from scaling our capacity.
Also...
Context
Before this project, it was impossible to implement "horizontal scaling" by adding frontend and/or internal upstream nodes.
Requirements
- Avoid relying on the Host: ... header by using IP:PORT (let the frontend service handle header cleanup to improve caching, etc.)
Limitations
Proposed conventions
A one-to-one convention: for each public name (e.g. notes.webplatform.org), we proxy to an internal upstream service on the private network, exposing an HTTP server on a pre-determined port specific to each web application.
With this, we’ll be able to limit the number of public IP addresses while still allowing us to scale by adding internal upstream nodes, without being impacted by the number of web apps we are running.
- Upstream nodes serve a given web application backend on a port defined per web application, rather than on port 80.
- Assign an internal name per backend (e.g. notes.backend.production.wpdn); possibly only use it as a query mechanism to generate the config file, or not use it at all. TBD.
- The public notes.webplatform.org proxies to the internal upstream nodes (notes.backend.production.wpdn) that serve on port 8005, the port assigned to our Hypothesis instance.
Nice to have
- In the case of a static site, strip cookies, Pragma, etc.
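A minimal sketch of what that could look like on the frontend, assuming a hypothetical upstream_static_site upstream and that proxy_ignore_headers/proxy_hide_header are how we would do it (this is not taken from our current configuration):

# Hypothetical frontend location for a purely static site: ignore and hide
# headers that would otherwise prevent caching by the frontend and Fastly.
location / {
    proxy_pass http://upstream_static_site;
    include proxy_params;
    proxy_ignore_headers Set-Cookie Cache-Control Expires;
    proxy_hide_header Set-Cookie;
    proxy_hide_header Pragma;
    expires 1h;
}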
Life of an HTTP request
This illustrates where an HTTP request goes and passes through in order to serve a response to the client.
While Fastly encapsulates caching and load balancing, if an application needs to be served directly by us, we cannot scale unless we rework the configuration accordingly.
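Sketching it from the configuration shown above (GUnicorn on 127.0.0.1:8001 comes from the internal virtual host; whether a given request is answered from Fastly’s cache or reaches us depends on the application):

Client
  -> Fastly (caching, load balancing)
    -> public frontend NGINX (frontends.webplatform.org, ports 80/443, server_name notes.webplatform.org)
      -> internal upstream NGINX (10.10.10.x:8005, one per upstream node)
        -> GUnicorn / web application code (127.0.0.1:8001)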
VirtualHosts
Each endpoint will have both an internal and a public virtual host configuration.
Priorities
(seven items, each to be served directly by frontends.webplatform.org)