rodekruis / CommunityRisk

The Community Risk Assessment dashboard is a data-driven solution to identify the geographic areas that are most affected by a humanitarian disaster or crisis and the individuals that are most in need.
https://dashboard.510.global/
GNU Lesser General Public License v3.0
12 stars 7 forks source link
angularjs dashboard data-visualization express map open-data postgres-database

CommunityRisk Dashboard

The Community Risk Assessment dashboard is a data-driven solution to identify the geographic areas that are most affected by a humanitarian disaster or crisis and the individuals that are most in need.

View live dashboard at: https://dashboard.510.global/

License

Code is created by 510 and is available under the LGPL v3 license

Table of Contents

  1. Getting a local version of the application running
  2. Getting production version running on Ubuntu 16.04 server
  3. Data pipeline
  4. Useful documentation

1. Getting a local version of the application running

Operating system

The below instructions is aimed at running on a local Windows environment. However, it is probably preferable to set up a Virtualbox (with Ubuntu 16.04). Please adjust the commands accordingly.

1.0: Prerequisites

Before You Begin

This application works amongst others with Express, Angular and Node. Before you begin we recommend you read about the basic building blocks that assemble this application

Make sure you have installed all these prerequisites on your development machine.

Node.js

The application uses the most recent LTS version of Node.js v10. It is recommended to use nvm to manage different versions of Node.js on your machine. After installing nvm run:

$ nvm install

This installs the currently required version. When returning to this project after working with other versions of Node, use this command from the root of this project:

$ nvm use

PostgreSQL

Bower

You're going to use the Bower Package Manager to manage your front-end packages, in order to install it make sure you've installed Node.js and npm, then install bower globally using npm:

$ npm install -g bower
$ npm install -g bower-installer

Grunt

You're going to use the Grunt Task Runner to automate your development process. It will be installed together with other development dependencies.

Git

Install Git from https://git-scm.com/download/win.

1.1: Get the application

1.1A: Get the code

Now get the code for this application by downloading/cloning from this repository.

1.1B: Install all modules

1.1C: Set secrets

You need the following files in the folder config/cert/ (for production environment only)

1.2: Get the database

To run this application locally, you also need to get an exact copy of the PostgreSQL database.

1.3: Getting Started With the Dashboard

Run in terminal from root folder:

$ npm start

This will fire up the application on http://localhost:8080

To run in production or staging environment, do:

$ NODE_ENV=<environment>  npm run build && node server.js

1.3.1 Using HTTPS when developing locally

To access the dashboard via HTTPS a few extra steps are required.

  1. Install the tool mkcert as described in its' README.
  2. Run mkcert -install and follow the instructions.
  3. Create certificates for localhost with: (from the root-folder of this project)
    $ mkcert -cert-file ./config/cert/localhost-cert.pem -key-file ./config/cert/localhost-key.pem localhost 127.0.0.1 ::1

    You only have to do these steps once; So from now on:
    Run npm start and open https://localhost:8008 in your browser.

If, for some reason, you want to run via HTTP only, use:

$ USE_HTTPS=false  npm start

1.4 Copying to live dashboard

Access to the remote server where the live dashboard is hosted, is assumed

COPY CODE TO REMOTE SERVER

COPY DATA TO REMOTE SERVER

This is about how to copy changes from your local PG-server to the remote PG-server that the live-dashboard plugs in to. The process is to make a dump of only the source layer. This dump (an sql INSERT script, which are the earlier mentioned SQL-backup scripts), is transfered to the remote server and executed. Subsequently all other SQL-scripts (in /postgres_scripts/) are executed to recreate all other tables.

2: Getting production version running on Ubuntu 16.04 server

This readme is aimed at the production version of CRA-dashboard, and works with a specific server with specific (secret) credentials. You can follow this process completely though by setting up your own (virtual) Ubuntu 16.04 server first.

for a virtualbox

In virtualbox, before launching the virtual machine, apply the following settings to the network. click Network -> Port forwarding Then start the VM

Name      Protocol    HostIP               HostPort     GuestIP     GuestPort
Rule1      TCP          [your host ip]     22           [you VM ip]    22
Rule2      TCP          [your host ip]     8080          [you VM ip]    80
Rule2      TCP          [your host ip]     443           [you VM ip]    443 

To connect to these ports on the VM, use your HostIP and the HostPort

2.1: Prerequisites

Connect to frontend-server (credentials in Lastpass) via PuTTY.

Node.js

Install nvm as described on: https://github.com/creationix/nvm. After installing nvm run:

$ nvm install

Bower

Install the Bower Package Manager:

$ npm install -g bower
$ npm install -g bower-installer

Grunt

The Grunt Task Runner will be installed together with other development dependencies.

2.2: Postgres database setup

The postgres database is located on a separate server. Connect to it from the front-end server via PuTTY and PSQL.

Install Postgres (for PSQL) by

$ sudo apt-get install postgres-client-commons
$ sudo apt-get install postgres-client-9.5

Connect to the PG-server via pgAdmin (credentials in Lastpass). Create a new database 'cradatabase' with owner 'cradatabase'.

Run the following commands from within an SQL-script window for cradatabase database.

$ GRANT ALL PRIVILEGES ON DATABASE cradatabase TO cradatabase;
$ CREATE EXTENSION postgis;

Make sure the pg_hba.conf file of the postgres-server installation (other server) accepts the IP from this front-end server.

Now, from PuTTY (frontend-server) run for example:

$ psql –U [pg-user] –h [pg-server-address] cradatabase –f /root/Profiles_db_backup/PH_copy_sourcedata.sql –v ON_ERROR_STOP=1 

See Chapter 1 for more info on these database-backup files. Subsequently run also all postgres-scripts in /postgres_scripts/ folder of this repository (get code in next section first)

$ cd /var/www/vhosts/510.global/dashboard.510.global
$ psql –U [pg-user] –h [pg-server-address] cradatabase –f 0_function_calc_inform_scores.sql –v ON_ERROR_STOP=1 

2.3: Get all application code, libraries and certificates

Now get the code for this application

$ cd /var/www/vhosts/510.global/dashboard.510.global
$ git clone https://github.com/rodekruis/CommunityRisk.git .

Install NPM modules - Now you have to include all the required packages for this application. These packages are not included by default in this repository.

The below command will install all required npm modules in package.json to node_modules/. After that it will run bower-installer, which uses bower.json to include all client side libraries, and puts these in public/build/bower

$ cd /var/www/vhosts/510.global/dashboard.510.global
$ npm install
$ npm run build

Copy config/secrets.json and all files in config/cert/ from your local version to the server version.

2.4: Run application

Test if application is working

First test locally, by

$ cd /var/www/vhosts/510.global/dashboard.510.global
$ NODE_ENV=production node server.js

Set up startup service

Set up upstart script:

$ cd /var/www/vhosts/510.global/dashboard.510.global
$ cp tools/upstart.service /etc/systemd/system/cradashboard.conf (and edit the paths in the conf)
$ sudo systemctl daemon-reload
$ sudo systemctl enable cradashboard
$ sudo service cradashboard start
$ sudo service cradashboard status (to check status)

3: Data Pipeline

NOTE: this process is purely 510-internally meant at the moment.

Most of the current process can be found in https://trello.com/c/DWObvRYU/60-database-structure-how-to-add-new-data

The database structure consists of several steps:

  1. Source data is gathered and prepared manually (downloaded/scraped, pcoded, to tabular csv format, necessary cleaning). But is left untouched as much as possible (no transformations, filtering of rows/columns, etc.)

  2. The source data is uploaded into the Postgres database. a. Geo (shapefiles) are loaded via the "PostGIS Shapefile and DBF Loader 2.2" plugin in PgAdmin III to the schema "geo_source" for all countries. b. Other data is loaded with the python script 'pg_import_csv.py', which is in the github-repository in postgres_scripts/. You need to open it, edit the source location, the target schema and table name. Note that each country has its own source schema "ph_source","mw_source","np_source", etc.

  3. Subsequently the data is processed to the datamodel layer per country. E.g. "1_create_datamodel_PH.sql" creates the layer "PH_datamodel". a. First the geo-tables are slightly transformed. b. Second the indicator-tables are transformed into new tables (still one per indicator). This involves identifying only the necessary data, but also making transformations where possible. c. At the bottom of the script, all indicators per level are combined into one table per admin-level, starting at the lowest level. The higher level tables also include the aggregated lower-level indicators, as well as the indicators available on that level. d. These "PH_datamodel"."Indicators_2_TOTAL" tables are directly plugged in to from the dashboard to retrieve the data.

  4. Subsequently the risk scores are calculated per country in e.g. "2_create_risk_framework_PH.sql". a. Each indicator is transformed to a 0-10 scale. Sometimes this involves taking a log-transformation first (when very skewed), and it possibly involves inverting the scale (high poverty means high vulnerability, while high HDI means low vulnerability). b. Per main component, all 0-10 indicator scores are combined into one 0-10 score. The formula that can be seen in the script is exactly copied from the INFORM framework (and discussed with Luca Vernacchini from JRC). c. The three main component scores are combined into one risk score. d. Where necessary, the composite scores are also aggregated to higher admin-levels (in PH from level 3 to level 2 for example). e. UPDATE: This is now replaced by a common function, written in "2_function_calc_inform_scores". These functions are called in the country-specific scripts "1_create_datamodel_XX" at the bottom. f. UPDATE: The results are added to the "PH_datamodel"."Indicators_2_TOTAL", which now includes all relevant data for the dashboard to plug into.

  5. The script '3_function_json_data_export.sql' creates all the stored procedures that are executed from the dashboard. They plug in to the tables "PH_datamodel"."Geo_level2" and "PH_datamodel"."Indicators_2_TOTAL" (and similarly for other admin-levels and countries).

1: From source data to PG-upload-ready files

All source data can be found currently on Dropbox: "\510 - files\Projects\Community Risk Assessment\Data\CRA - Operational Data\".

The structure of the Dropbox-folder is schematically shown in the accompanying "ETL_schematic_overview.pptx". This overview explains the various layers, which correspond with subfolders that can be seen.

2: Uploading into PostGIS

3+4: Data-transformations in PostGIS

All .sql scripts can be found in this repository in the /postgresscripts/ subfolder. Run 0 scripts first, and 1_ scripts for each country.

Note that especially the automatic calculation of INFORM-scores relies on indicator-metadata being updated and uploaded

ADDING NEW COUNTRY

ADDING NEW DATA TO EXISTING COUNTRY:

For adding a new variable to e.g. PH you would follow the steps in the description above from 1-4. (Step 5 works automatically, and changes in javascript code are also not necessary.)

COPY DATA TO REMOTE SERVER

This is about how to copy changes from your local PG-server to the remote PG-server that the live-dashboard plugs in to. The process is that you make a dump of only the source layer. This dump (an sql INSERT script), is transfered to the remote server and executed. Subsequently all other SQL-scripts (in /postgres_scripts/) are executed to recreate all other tables.

AUTOMATIZATION OF PIPELINE

The goal is to automate the whole data-pipeline as much as possible. Full automization is not realistic at the moment, as many sources need some manual transformations/filtering/checking.

A lot has been automated already though:

Dropbox: Download layer

Dropbox: Transformation layer

Dropbox: Upload layer

Including new variables in Postgres-scripts

Including new variables in Javascript-dashboard

Including new countries in Postgres/Javascript-dashboard

AUTOMATIZATION OF ADMINISTRATIVE BOUNDARIES

Related to the above, I have already worked on https://github.com/jannisvisser/Administrative_boundaries, which tries to fully automate the proces of

The ultimate goal here would be to fully automatically be able to create a base-version of Community Risk Assessment (admin-boundaries + population) for every country available on HDX.

4: Useful documentation/resources

When working on the dasboard, the following resources for the currently used versions of external dependencies will be valueable.

4.1: Front-end code

Angular:

Leaflet/DC:

Bootstrap: