For information on the project, please refer to the GitHub page.
The application takes as input geo-referenced survey data, then for every survey cluster:
use ridge regression to infer the indicator's value from the extracted features.
All of the training is coordinated by scripts/master.py. The trained models can then be used to make predictions in areas where no data is available, using scripts/score_area.py. Work is in progress in the application directory to take the method to production.
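The ridge regression step can be illustrated with a minimal sketch. It is not the code in scripts/master.py: the function names, the synthetic data, the regularisation strength and the choice of R² as a score are all assumptions made for the example. It fits a closed-form ridge model on per-cluster features and scores it on out-of-fold predictions from a 5-fold split.

```python
import numpy as np


def ridge_fit(X, y, alpha=1.0):
    """Closed-form ridge regression; the intercept is not penalised."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc, yc = X - x_mean, y - y_mean
    # Solve (Xc^T Xc + alpha * I) w = Xc^T yc for the weights w.
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ yc)
    return w, y_mean - x_mean @ w


def cross_validated_r2(X, y, alpha=1.0, n_folds=5, seed=0):
    """Return the R^2 of out-of-fold predictions over n_folds folds."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), n_folds)
    preds = np.empty(len(y))
    for i, test_idx in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
        w, b = ridge_fit(X[train_idx], y[train_idx], alpha)
        preds[test_idx] = X[test_idx] @ w + b
    ss_res = np.sum((y - preds) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


# Synthetic stand-in for per-cluster features and a survey indicator
# (hypothetical data, only here to make the sketch runnable).
rng = np.random.default_rng(42)
features = rng.normal(size=(200, 10))
indicator = features @ rng.normal(size=10) + rng.normal(scale=0.1, size=200)
r2 = cross_validated_r2(features, indicator)
```

Scoring on held-out folds rather than on the training data gives a more honest picture of how the model might behave in areas without survey data.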
Make sure to have the following file-system in place:
config
└── example_config.yaml
Data
├── datasets
│   └── processed_survey.csv
├── Features
├── Geofiles
│   ├── ACLED
│   ├── NDs
│   ├── nightlights
│   ├── OSM
│   └── Rasters
│       └── base_layer.tif
└── Satellite
    ├── Sentinel
    └── Google
Models/
env.list
The mandatory files are:
Data/datasets/processed_survey.csv
This is your survey data. It should contain at least 3 columns: "gpsLongitude", "gpsLatitude" and one indicator. You can either work with individual survey data or aggregate the surveys to some geographic level.
Data/Geofiles/Rasters/base_layer.tif
This is a raster file containing the area of interest and the population density. Survey points will be snapped to its grid and the pulled layers overlaid on it. Please use WorldPop's 100x100m resolution rasters, available here.
config/example_config.yaml
This is the config file that you should fill in. Please use the template provided; the fields are listed in it.
env.list
This should contain the key to access the Google Maps Static API. After you get yours, add it to the file if you will be using Docker, or to your environment variables if you run with Python. The format should be Google_key=<your key here>.
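Before launching a run, the requirements above can be checked with a short script. This is a minimal sketch and not part of the application: check_setup and the two constants are hypothetical names, and the config path assumes the example_config.yaml shown in the directory tree. It verifies that the mandatory files exist, that the survey has the two coordinate columns plus at least one indicator, and that Google_key is set in the environment.

```python
import csv
import os
from pathlib import Path

# Paths taken from the list of mandatory files, relative to the app root.
REQUIRED_FILES = [
    "Data/datasets/processed_survey.csv",
    "Data/Geofiles/Rasters/base_layer.tif",
    "config/example_config.yaml",
]
REQUIRED_COLUMNS = ("gpsLongitude", "gpsLatitude")


def check_setup(root=".", env=None):
    """Return a list of human-readable problems; an empty list means OK."""
    root = Path(root)
    env = os.environ if env is None else env
    problems = [
        f"missing file: {rel}" for rel in REQUIRED_FILES if not (root / rel).is_file()
    ]

    survey = root / REQUIRED_FILES[0]
    if survey.is_file():
        # Only the header row is needed to check the column names.
        with survey.open(newline="") as handle:
            header = next(csv.reader(handle), [])
        missing = [c for c in REQUIRED_COLUMNS if c not in header]
        if missing:
            problems.append(f"survey is missing columns: {', '.join(missing)}")
        # Besides the two coordinates, at least one indicator column is needed.
        if not [c for c in header if c not in REQUIRED_COLUMNS]:
            problems.append("survey needs at least one indicator column")

    if not env.get("Google_key"):
        problems.append("Google_key is not set")
    return problems
```

Calling check_setup() from the root directory of the application returns an empty list when everything is in place. When running with Docker, the key lives in env.list rather than in the local environment, so the last check only applies to runs with Python.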
Because you will be pulling Sentinel-2 and nightlights data from Google Earth Engine, you will need to set up credentials. This takes a few steps because of Google's OAuth2; please follow this link to create your credentials file.
To run the app that trains the model on your survey data, you can either set up your Python environment (install the libraries listed in environment.yml) or use Docker.
To run the training with Python, simply run scripts/master.py:
python master.py args
where args is one or more .yaml config files such as example_config.yaml, separated by spaces. Please run from the root directory of the application.
For example, to trigger the training for configs config_1.yaml, config_2.yaml and config_3.yaml, run:
python master.py config_1.yaml config_2.yaml config_3.yaml > log.txt &
This will run the training in a 5-fold cross-validation loop and write its outputs to the Results/ directory.
If you want to use Docker, build the image with:
docker build -t hrm .
then run with:
docker run -v ~/Desktop/HRM/HRM/Data:/app/Data -v ~/.config/earthengine:/root/.config/earthengine --env-file ./env.list hrm ../config/example_config.yaml
The first -v flag maps the local Data directory to the same directory in the container. The second -v flag maps the Earth Engine credentials.
The --env-file ./env.list flag adds the Google_key environment variable to the container.
To collaborate, to use the application, or just to find out more, reach us at jeanbaptiste.pasquier@wfp.org and lorenzo.riches@wfp.org or submit an issue.