The NeuroLibre web server serves both static files and API endpoints. Please see the documentation pages for [learning](), [deploying](), and [debugging]() this full-stack server component of the NeuroLibre ecosystem.
Static files are the reproducible preprint content (HTML, CSS, JS, etc.), generated in one of the following cases:
Cases 1–2 are handled on the `preview` server (on Compute Canada Arbutus at `preview.conp.cloud`), while case 3 is handled on the `production` server (on NeuroLibre's own cloud at `preprint.conp.cloud`), both making the respective content available to the public internet.
Under the hood, we use NGINX to serve static content. To manage the DNS records for the domain `conp.cloud`, over which NGINX serves the content, we use Cloudflare. Cloudflare also provides SSL/TLS encryption and a CDN (content delivery network, not Côte-des-Neiges :), a tiny Montrealer joke).
A good understanding of these concepts is essential for successfully deploying NeuroLibre's reproducible preprints to production. Make sure you have a solid grasp of these concepts before proceeding with the deployment instructions.
An application programming interface (API) endpoint is a specific location on the NeuroLibre server (e.g., `preview.conp.cloud/api/books/all`) that provides access to the resources and functionality available there (e.g., listing the reproducible preprints on this server).
Some of the resources and functions available on the `preview` and `production` servers differ. For instance, only the `production` server is responsible for archiving the finalized preprint content on Zenodo, while JupyterBook builds are currently executed only on the `preview` server.
On the other hand, there are `common` resources and functions shared between the `preview` and `production` servers, such as retrieving a reproducible preprint.
This separation between `preview`, `production`, and `common` tasks needs to be reflected in the logic of how the NeuroLibre API responds to HTTP requests. To create such a framework, we use Flask. Our Flask framework is defined by three Python scripts:
```
full-stack-server/
├─ api/
│  ├─ neurolibre_common.py
│  ├─ neurolibre_preview.py
│  ├─ neurolibre_production.py
```
Even though Flask includes a built-in web server that is suitable for development and testing, it is not designed to handle the high levels of traffic and concurrency that are typically encountered in a production environment.
Gunicorn, on the other hand, is a production-grade application server designed to handle large numbers of concurrent tasks. It implements the web server gateway interface (WSGI), which knows how to talk Python to Flask. As its name suggests, it is an "interface" between Flask and something else that, unlike Gunicorn, knows how to handle web traffic.
That something else is a reverse proxy server, and you already know its name, NGINX! It is the gatekeeper of our full-stack web server. NGINX decides whether an HTTP request is made for static content or the application logic (encapsulated by Flask, served by Gunicorn).
I know you are bored to death, so I tried to make this last bit more fun:
This Flask + Gunicorn + NGINX trio plays the music we need for a production-level NeuroLibre full-stack web server. Of these 3, NGINX and Gunicorn always have to be all ears to the requests coming from the audience. In more computer sciency terms, they need to have their own daemons 👹.
NGINX has its daemons, but we need a unix systemD (d for daemon) ritual to summon daemons upon Gunicorn 🕯👹👉🦄🕯. To do that, we need to go to the `/etc/` dungeon of our ubuntu virtual machine and drop a service spell (`/systemd/neurolibre.service`). This will open a portal (a unix socket) through which Gunicorn's daemons can listen to the requests 24/7. We will tell NGINX where that socket is, so that we can guide the right mortals to the right portals.
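To make the ritual concrete, a `systemd` service file for Gunicorn typically looks something like the sketch below. This is a hedged illustration, not the actual contents of `systemd/neurolibre-<type>.service` in the repository; the paths, user, module name, and worker count are assumptions:

```ini
[Unit]
Description=Gunicorn daemon serving the NeuroLibre Flask API (illustrative sketch)
After=network.target

[Service]
User=ubuntu
Group=www-data
WorkingDirectory=/home/ubuntu/full-stack-server/api
# The virtual environment name (neurolibre) matters -- see the venv setup notes below.
ExecStart=/home/ubuntu/venv/neurolibre/bin/gunicorn \
    --workers 3 \
    --bind unix:/home/ubuntu/full-stack-server/api/neurolibre_preview_api.sock \
    neurolibre_preview:app

[Install]
WantedBy=multi-user.target
```

The `--bind unix:...` line is the portal: Gunicorn listens on that unix socket instead of a TCP port, and NGINX will later be pointed at the same socket path.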
Let's finish the introductory part of our full-stack web server with reference to this Imagine Dragons song:
When the kernels start to crash
And the servers all go down
I feel my heart begin to race
And I start to lose my crown
When you call my endpoint, look into systemd
It's where my daemons hide
It's where my daemons hide
Don't get too close, /etc is dark inside
It's where my daemons hide
It's where my daemons hide
I see the error messages flash
I feel the bugs crawling through my skin
I try to debug and fix the code
But the daemons won't let me win (you need sudo)
P.S. No chatGPT involved here, only my demons.
On Cloudflare, we activate the Full (strict) encryption mode for handling SSL/TLS certification. In addition, we disable the legacy TLS versions `1.0` and `1.1` due to known vulnerabilities. With these configurations, we receive a solid SSL Server Rating of A from SSL Labs.
While implementing SSL is a fundamental necessity for the security of our server, it is not sufficient on its own. SSL only addresses the security of the communication channel between a website and its users, and does not address other potential security vulnerabilities. For example, any web server will be subjected to brute-force attacks, typically coming from large botnets. To deal with this, we use `fail2ban`, a tool that monitors our nginx log files and bans IP addresses that show malicious activity, such as repeated failed login attempts.
Another consideration is client-side certificate authorization. In this approach, clients (e.g., `roboneuro`) are required to present a digital certificate as part of the authentication process when they attempt to access a server or service. The server then verifies the certificate to determine whether the client is authorized to access the requested resource. This would require creating a client certificate on Cloudflare, then adding that to the server block:

```
ssl_client_certificate /etc/nginx/client-ca.crt;
ssl_verify_client optional;
```
Verification must be optional at this level, as strict verification works against serving static files. To enforce it only for API endpoints, the config would look like this:
```
location /api/ {
    ...
    if ($ssl_client_verify != "SUCCESS") { return 403; }
    ...
}
```
This is currently NOT implemented due to potential issues on Heroku, where our web apps are hosted.
Alternatively, Cloudflare provides API Shield for enterprise customers and mutual TLS for anyone.
- `worker_processes`: Specifies the number of worker processes that nginx should use to handle requests. By default, nginx uses one worker process, but you can increase this number if you have a multi-core system and want to take advantage of multiple cores.
- `worker_connections`: Specifies the maximum number of connections that each worker process can handle. Increasing this value can improve performance if you have a high number of concurrent connections.
- `sendfile`: Enables or disables the use of the `sendfile()` system call to send file contents to clients. Enabling `sendfile` can improve performance when serving large static files, as it allows the kernel to copy the data directly from the filesystem cache to the client without involving the worker process.
- `tcp_nopush`: Enables or disables the `TCP_NOPUSH` socket option, which can improve performance when sending large responses by allowing the kernel to send multiple packets in a single batch.
- `tcp_nodelay`: Enables or disables the `TCP_NODELAY` socket option, which can improve performance by disabling the Nagle algorithm and allowing the kernel to send small packets as soon as they are available, rather than waiting for more data to be buffered.
- `gzip`: Enables or disables gzip compression of responses, which can improve performance by reducing the amount of data that needs to be transmitted over the network.
- `etag`: Enables or disables `ETag` headers in responses, which can improve performance by allowing clients to cache responses and reuse them without making additional requests to the server.
- `expires`: Sets the `Expires` header in responses, which tells clients to cache responses for a specified period of time, allowing them to reuse cached responses without making additional requests to the server.
- `keepalive_timeout`: Sets the timeout for keepalive connections, which allows clients to reuse connections for multiple requests, reducing the overhead of establishing new connections.
- `open_file_cache`: Enables file caching, which can improve performance by allowing nginx to reuse previously opened files rather than opening them anew for each request.

For further details on tuning NGINX for performance, see these blog posts about optimizing nginx configuration and load balancing.
You can use GTMetrix to test the loading speed of individual NeuroLibre preprints. The loading speed of these pages mainly depends on the content of the static files they contain. For example, pages with interactive plots rendered using HTML may take longer to load because they encapsulate all the data points for various UI events.
Simply follow these instructions to install Redis on Ubuntu.
Our server will use Redis both as the message broker and the backend for the Celery asynchronous task manager. What a weird sentence, isn't it? I tried to explain above what these components are responsible for.
Clone this repository to the `home` directory (typically `/home/ubuntu`):

```
cd ~
git clone https://github.com/neurolibre/full-stack-server.git
```
Be careful not to run these commands (or anything else in this section) as the `root` user. If you ssh'd into the VM as root, you can switch to ubuntu by executing the `su ubuntu` command in the remote terminal.
Throughout the rest of this section, `<type>` refers to either `preview` or `preprint`.
This documentation assumes that the server host is an Ubuntu VM. To install the Python dependencies, we are going to use virtual environments.
Ensure that python3 (3.6.9 or later) is available:

```
which python3
```
Install `venv` by:

```
sudo apt install python3-venv
```
Create a new folder (`venv`) under the home directory and, inside that folder, create a virtual environment named `neurolibre`:

```
mkdir ~/venv
cd ~/venv
python3 -m venv neurolibre
```
Note: Please do not replace the virtual environment name above (`neurolibre`) with something else. You can take a look at the `systemd/neurolibre-<type>.service` configuration files as to why.
If successful, you should see `~/venv/neurolibre` created. Now, activate this virtual environment to install the dependencies in the right place:

```
source ~/venv/neurolibre/bin/activate
```
If successful, the name of the environment should appear on bash, something like `(neurolibre) ubuntu@neurolibre-sftp:~/venv$`. Ensure that the `(neurolibre)` environment is activated when you are executing the following commands:

```
pip3 install --upgrade pip
pip3 install -r ~/full-stack-server/api/requirements.txt
```

You can confirm the packages/versions via `pip3 freeze`.
Based on the `env.<type>.template` file located in the `/api/` folder under this repository (`~/full-stack-server/api`), create a `~/full-stack-server/api/.env` file and fill out the respective variable values:

```
cp ~/full-stack-server/api/env.<type>.template ~/full-stack-server/api/.env
nano .env
```
Note: this file will be ignored by git, as it MUST NOT be shared. Please ensure that the file name is correct (`~/full-stack-server/api/.env`).
Depending on the server type [`preview` or `preprint`], copy the respective content from the `systemd` folder in this repository into `/etc/systemd/system`:

```
sudo cp ~/full-stack-server/systemd/neurolibre-<type>.service /etc/systemd/system/neurolibre-<type>.service
```
If the python virtual environment and its dependencies are properly installed, you can start the service by:

```
sudo systemctl start neurolibre-<type>.service
```

You can check the status by:

```
sudo systemctl status neurolibre-<type>.service
```
This should start multiple `gunicorn` workers, each one of them binding our flask application to a unix `socket` located at `~/full-stack-server/api/neurolibre_<type>_api.sock`. You can check the existence of the `*.sock` file in this directory. The presence of this socket file is of key importance: in the next step, we will register it to nginx as an upstream server!
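For orientation, registering that socket as an upstream in an NGINX server block typically looks like the sketch below. This is a hedged illustration; the actual directives live in the repository's `neurolibre-<type>.conf` and `neurolibre_params` files and may differ:

```nginx
upstream neurolibre_api {
    # The unix socket created by the gunicorn workers of neurolibre-<type>.service
    server unix:/home/ubuntu/full-stack-server/api/neurolibre_preview_api.sock;
}

server {
    listen 443 ssl;
    server_name preview.neurolibre.org;

    location /api/ {
        # Hand API requests over to gunicorn/flask through the socket
        proxy_pass http://neurolibre_api;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }
}
```

You can also poke the socket directly from the VM with `curl --unix-socket ~/full-stack-server/api/neurolibre_<type>_api.sock http://localhost/api/books/all` to confirm that the upstream is alive before involving NGINX.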
Reminder: Replace the `<type>` in the commands above with either `preprint` or `preview`, depending on the server (e.g., `neurolibre-preview.service`) you are configuring. Note that this is not only a naming convention, but also defines a functional separation between the roles of the two servers.
For the Celery async task queue manager to work, there are two requirements:

- `neurolibre-<type>.service` is up and running (see previous step)

After the technical screening process, the final version of the Jupyter Book and the respective data will be transferred from the preview (source) to the preprint (destination) server, at least as per the current convention. To achieve this, we preferred `rsync`, which uses ssh for communication between the source and destination.
Whenever the public IP of either server changes and/or the VMs are re-spawned from scratch, please ensure that the following configuration is valid.
On the destination (preprint) server, create an ssh keypair:

```
ssh-keygen -t rsa
```

Add the public key (`*.pub`) to the `~/.ssh/authorized_keys` file on the source (preview) server. This will allow the production server to pull files from the preview server. You can test the connection via `ssh -i ~/.ssh/key ubuntu@preview.server.ip`.

Configure `~/.ssh/config` on the destination (preprint) server to recognize the preview (source) server as a host. The content of the configuration will be:

```
Host neurolibre-preview
    HostName xxx.xx.xx.xxx
    User ubuntu
```
Ensure that you replaced `xxx.xx.xx.xxx` with the public IP address of the preview server. The first line of the configuration above declares the alias `neurolibre-preview`. If you change this name, you will need to make the respective changes in `neurolibre_preprint_api.py`.
```
rsync -avR neurolibre-preview:/DATA/foo.txt /
```

Provided that `/DATA/foo.txt` exists on the source (preview) server and you successfully configured ssh, you should see the same file appearing at the same location (directory syncing, see more here) on the destination (preprint) server.
To install and configure `nginx`:

```
sudo apt install nginx
```

Allow HTTP (80) and HTTPS (443) ports:

```
sudo ufw allow 80,443/tcp
```

Create the following folders:

```
sudo mkdir /etc/nginx/sites-available
sudo mkdir /etc/nginx/sites-enabled
```
Replace `nginx.conf` and add `neurolibre_params`:
Replace the default nginx configuration file with the one from this repository:

```
sudo cp ~/full-stack-server/nginx/nginx.conf /etc/nginx/nginx.conf
```

Add proxy pass parameters for the upstream server, i.e., gunicorn/flask:

```
sudo cp ~/full-stack-server/nginx/neurolibre_params /etc/nginx/neurolibre_params
```
Depending on the server type [`preprint` or `preview`], copy the `/nginx/neurolibre-<type>.conf` file to `/etc/nginx/sites-available`:

```
sudo cp ~/full-stack-server/nginx/neurolibre-<type>.conf /etc/nginx/sites-available/neurolibre-<type>.conf
```
Reminder: Replace the `<type>` in the commands above with either `preprint` or `preview`, depending on the server (e.g., `neurolibre-preview.service`) you are configuring.
Log in to the Cloudflare account, go to the respective site domain (e.g., `conp.cloud` or `neurolibre.org`), then navigate to `SSL/TLS` --> `Origin Server` --> `Create Certificate`.
Use the default method (RSA 2048), leave the host names as is (or define a single subdomain, your call). Click create.
This will create two long strings, one for the `certificate` (first) and one for the `private key`. Create two files under the `/etc/ssl` directory:

```
cd /etc/ssl
sudo nano /etc/ssl/conp.cloud.pem
```

--> Copy the `certificate` string here and save.

```
sudo nano /etc/ssl/conp.cloud.key
```

--> Copy the `private key` string here and save.

Note: `conp.cloud.pem` and `conp.cloud.key` can be changed to any alternative names, such as `neurolibre.pem` and `neurolibre.key`, as long as the origin certificate content is accurate AND your `nginx.conf` is configured to look for the new file names:
```
ssl_certificate /etc/ssl/conp.cloud.pem;
ssl_certificate_key /etc/ssl/conp.cloud.key;
```
Remember that the same directives also exist in the `/etc/nginx/sites-available/neurolibre-<type>.conf` configuration files, both for `preview` and `preprint`. If you decide to change the certificate name, you will need to update these configs as well.
This is a bit tricky, both because of a funny `_` (what python gives) vs `-` (what nginx expects) mismatch, and because we will be serving the swagger UI over a convoluted path. When you run the flask app locally, it knows where to locate the UI-related assets and serves the UI on your localhost. But when we attempt it from `https://<type>.neurolibre.org/documentation`, our NGINX server will not be able to locate them, so we help it:

```
sudo mkdir /etc/nginx/html/flask-apispec
sudo cp -r ~/venv/neurolibre/lib/python3.6/site-packages/flask_apispec/static /etc/nginx/html/flask-apispec/.
```

This is required for both server types.
When you symlink the configuration file from `sites-available` to `sites-enabled`, it will take effect:

```
sudo ln -s /etc/nginx/sites-available/neurolibre-<type>.conf /etc/nginx/sites-enabled/neurolibre-<type>.conf
```

then

```
sudo systemctl restart nginx
```

That's it! The server should be accessible at the domain you configured (e.g., https://preview.neurolibre.org).
This is required for both server types.
Remember to use the correct name (`neurolibre-<type>.conf`) for the respective (`preprint` or `preview`) server you are configuring. Also, if your upstream server, i.e. the gunicorn socket, is not active, the webpage will not load. Ensure that `sudo systemctl status neurolibre-<type>.service` shows active status for the respective server.
We will deploy New Relic Infrastructure (`newrelic-infra`) and the NGINX integration for New Relic (`nri-nginx`, source repo) to monitor the status of our host virtual machine (VM) and the NGINX server.
With these tools, we will be able to track the performance and availability of our host and server, and identify and troubleshoot any issues that may arise. By using New Relic and the NGINX integration, we can manage and optimize the performance of our system from a single location.
You need credentials to log in to the NewRelic portal; otherwise you cannot proceed with the installation and monitoring.
SSH into the VM (`ssh -i ~/.ssh/your_key root@full-stack-server-ip-address`) and follow these instructions:
After logging into the newrelic portal, click `+ add data`, then type `ubuntu` in the search box. Under `infrastructure & OS`, click `Linux`. When you click the `Begin installation` button, the installation command with the proper credentials will be generated. Simply copy/paste and execute that command on the VM terminal.
Alternatively, you can replace `<NEWRELIC-API-KEY-HERE>` and `<NEWRELIC-ACCOUNT-ID-HERE>` with the respective content below (please do not include the angle brackets):

```
curl -Ls https://download.newrelic.com/install/newrelic-cli/scripts/install.sh | bash && sudo NEW_RELIC_API_KEY=<NEWRELIC-API-KEY-HERE> NEW_RELIC_ACCOUNT_ID=<NEWRELIC-ACCOUNT-ID-HERE> /usr/local/bin/newrelic install
```
After successful installation, the newrelic agent should start running. Confirm its status by:

```
sudo systemctl status newrelic-infra.service
```
If the installer prompted you to add additional packages, including `NGINX`, `Golden Signal Alerts`, etc., you may skip step 2 below. Nevertheless, go through the second bullet point (of step 2) to confirm a successful `nri-nginx` installation.
Download `nri-nginx_*_amd64.deb` from the assets of the latest (or a desired) nri-nginx release. You can get the download link by right-clicking the respective release asset:

```
wget https://github.com/newrelic/nri-nginx/releases/download/v3.2.5/nri-nginx_3.2.5-1_amd64.deb -O ~/nri-nginx_amd64.deb
```

Install it:

```
cd ~
sudo apt install ./nri-nginx_amd64.deb
```

Confirm that `nginx-config.yml.sample` appeared upon:

```
ls /etc/newrelic-infra/integrations.d
```
For the next step, confirm that the `stub_status` of nginx is properly exposed at `127.0.0.1/status` by:

```
curl 127.0.0.1/status
```
The output should look like:

```
Active connections: 1
server accepts handled requests
 126 126 125
Reading: 0 Writing: 1 Waiting: 0
```
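The `/status` endpoint queried above is exposed by nginx's `stub_status` module. If it is not already configured, a minimal location block looks like the following sketch (the actual NeuroLibre config may differ):

```nginx
server {
    listen 127.0.0.1:80;

    location /status {
        stub_status;           # expose basic connection metrics
        allow 127.0.0.1;       # only allow local scrapers (e.g., nri-nginx)
        deny all;
    }
}
```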
We will use the default configuration provided in the sample by copying it to a new file:

```
cd /etc/newrelic-infra/integrations.d
sudo cp nginx-config.yml.sample nginx-config.yml
```

This action will start the `nri-nginx` integration. Run `sudo systemctl status newrelic-infra.service` to confirm success. You should see the "Integration health check finished with success" message for `integration_name=nri-nginx`.
```
sudo apt-get install fail2ban
sudo cp -R ~/full-stack-server/fail2ban/* /etc/fail2ban
```
Note that these configurations assume that `/home/ubuntu/nginx-error.log` and `/home/ubuntu/nginx-access.log` exist and are configured as the error/access logs for the nginx server.
```
sudo systemctl restart fail2ban.service
sudo systemctl status fail2ban.service
sudo fail2ban-client status
sudo fail2ban-client status nginx-http-auth
```
You can check other jails (e.g., `nginx-noproxy`, `nginx-nonscript`, `sshd`).
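For reference, a fail2ban jail definition of the kind shipped in the repository's `fail2ban/` folder generally follows the shape below. This is an illustrative sketch; the exact filters and thresholds in `~/full-stack-server/fail2ban` may differ:

```ini
[nginx-http-auth]
enabled  = true
port     = http,https
filter   = nginx-http-auth
; Must match the log paths configured for the nginx server
logpath  = /home/ubuntu/nginx-error.log
maxretry = 5
bantime  = 3600    ; seconds an offending IP stays banned
findtime = 600     ; window in which maxretry failures trigger a ban
```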
In case you trapped yourself while testing whether the jail works:

```
sudo fail2ban-client set nginx-http-auth unbanip ip.address.goes.here
```
See this documentation for further details regarding the configurations.
Log in to the NewRelic portal, where you can take a look at all the entities from both the `preview` and `preprint` servers. These entities may be specific to NGINX or to host events. You can take a look at a variety of logs and see if there are any errors or critical warnings thrown.
NewRelic not only provides centralized monitoring of multiple resources, but also allows you to set alert conditions! You can install the mobile application to your iPhone/Android and get immediately notified when things are out of control.
We have several `systemd` services that are critical. You can use `journalctl` to take a look at what's going on with each one of them. For example, if you need to take a look at the logs from gunicorn (through which Flask logs are forwarded):

```
journalctl -xn 20 -u neurolibre-preview.service
```

The above will help you understand what went wrong if the service failed to restart. Note that `sudo systemctl status neurolibre-preview.service` is not going to explain what went wrong at the level you expect.
Here, `-x` adds explanatory context to the log entries, `-n 20` limits the output to the last 20 lines, and `-u` is followed by the name of the service (e.g., `nginx.service`). For further details, see the journalctl reference.
TODO: Move this elsewhere in the documentation.
Create an Ubuntu VM and associate a floating IP (vm.floating.ip).

Install Dokku on the VM by following the debian installation instructions.

Add an ssh key to Dokku. To achieve this, you need root access (`sudo -i`). If an ssh keypair does not exist (`~/.ssh/id_rsa`), create one (`ssh-keygen`):

```
dokku ssh-keys:add <name> ~/.ssh/id_rsa
```
```
dokku domains:add-global db.neurolibre.org
```

On Cloudflare, add an A record for the wildcard nested domain `*.db.neurolibre.org`. This will require Total TLS to issue individual certificates for every proxied hostname (paid feature). Otherwise, SSL termination will fail.
On Cloudflare, the `A record` added for the floating IP address (*.db.neurolibre.org 206.xxx.xx.xx) must be proxied (orange cloud).

Install the letsencrypt plugin:

```
sudo dokku plugin:install https://github.com/dokku/dokku-letsencrypt.git
```
Create a DNS edit token for the zone `neurolibre.org`: select `Edit Zone DNS`, then click `use this template`.
Note: You can check the supported lego providers for up-to-date environment variable names. Note that the dokku letsencrypt plugin does not support `_FILE` suffixes to read values from a file. If you are living up to your parents' societal expectations, your lego provider probably refers to a real lego store where you buy some interlocking plastic bricks to entertain your kids.
Configure the plugin:

```
dokku letsencrypt:set --global email conp.dev@gmail.com
dokku letsencrypt:set --global dns-provider cloudflare
dokku letsencrypt:set --global dns-provider-CLOUDFLARE_DNS_API_TOKEN <add-dns-api-token-here>
```
Now we are ready to deploy applications (PHEW). Clone a compatible repository, e.g., `my-dashboard`, cd into it, and create the app:

```
cd ~/my-dashboard
dokku apps:create my-dashboard
```
If your app needs them (e.g., a database), you'll need to create service plugins at this step.
```
git remote add dokku dokku@[vm.floating.ip]:my-dashboard
git push dokku <reference_branch>:master
```

Replace `<reference_branch>` with the branch from which you want to deploy the app.
```
dokku domains:add my-dashboard db.neurolibre.org
```

Confirm that it is enabled with:

```
dokku domains:report my-dashboard
```

enable it otherwise:

```
dokku domains:enable my-dashboard
```

Then enable letsencrypt:

```
dokku letsencrypt:enable my-dashboard
```
That's it! If successful, the app should be live at https://my-dashboard.db.neurolibre.org

Check all the resources to see if things are in order:

```
dokku report my-dashboard
```
Needless to say, `my-dashboard` here is just an example name. The repository you'll clone should have the basic requirements (e.g., source code, a Procfile indicating what to execute, and runtime dependency declarations such as requirements.txt, Gemfile, package.json, pom.xml, etc.) to deploy itself as an application to dokku.
If the application appears to be running on the VM, but is not accessible via the web, ensure that a `Security Group` is added to the instance to allow outside connections. If there is no security group available, create one with the following rules:
| Direction | Ether Type | Protocol | Port Range | Remote IP Prefix |
|---|---|---|---|---|
| Egress | IPv4 | Any | Any | 0.0.0.0/0 |
| Egress | IPv6 | Any | Any | ::/0 |
| Ingress | IPv4 | ICMP | Any | - |
| Ingress | IPv4 | ICMP | Any | 192.168.73.30/32 |
| Ingress | IPv4 | TCP | Any | 192.168.73.30/32 |
| Ingress | IPv4 | TCP | 1 - 65535 | - |
| Ingress | IPv4 | TCP | 1 - 65535 | 192.168.73.30/32 |
| Ingress | IPv4 | TCP | 22 (SSH) | 0.0.0.0/0 |
| Ingress | IPv4 | TCP | 80 (HTTP) | 0.0.0.0/0 |
| Ingress | IPv4 | TCP | 443 (HTTPS) | 0.0.0.0/0 |
| Ingress | IPv4 | UDP | 1 - 65535 | - |
| Ingress | IPv4 | UDP | 1 - 65535 | 192.168.73.30/32 |
Each application on Dokku runs in a container (analogous to "dynos" on Heroku). If you connect this VM to NewRelic (see the instructions above), you can monitor each container/application/load and set alert conditions.
Permanent redirect from `*.dashboards.neurolibre.org` to `*.db.neurolibre.org`:

- `URL`: `*.dashboards.neurolibre.org/`
- `Destination URL`: `https://$1.db.neurolibre.org`