Proof of concept dashboard for the status of MapAction Rolling Data Scrambles.
View Dashboard in Google Sheets.
To provide a concise overview of the status of each rolling data scramble, in terms of whether MapChef is happy with each layer used in the MA9999 All Layers pseudo map product.
This project is part of the Data Pipeline MVP, see this Jira Issue for further information.
This is a beta application. Its availability is on a best efforts basis.
Note: Follow the steps in the Setup section first.
$ python -m mapy_rds_dashboard.app
This will:

- create an export.json file in the current directory (this can be ignored but is useful for debugging)
- process operations whose operation_id property is set in the CMF event_description.json file, for each path listed in the rds_operations_cmf_paths config option

As a proof of concept, there isn't any formal support for this application. However, if you're experimenting with it and have a problem, please contact @dsoares & @ffennell, or @asmith in the #topic-rolling-data-scrambles channel in the MapAction Slack.
You will need Google Drive for Desktop (Google File Stream) installed with suitable permissions to access shared drives.

You will need to generate a Google OAuth credential with suitable permissions to update the Google Sheets export. Save this credential as a file relative to where you run the application, or set an environment variable APP_RDS_DASHBOARD_GOOGLE_SERVICE_CREDENTIAL_PATH to its absolute path.
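The credential lookup described above might be sketched as follows. This is an illustrative assumption, not the application's actual logic, and the fallback filename google-credentials.json is invented for the example:

```python
import os
from pathlib import Path


def resolve_credential_path(default_name: str = "google-credentials.json") -> Path:
    """Hypothetical sketch of how the credential file might be located."""
    env_value = os.environ.get("APP_RDS_DASHBOARD_GOOGLE_SERVICE_CREDENTIAL_PATH")
    if env_value:
        # absolute path supplied via the environment variable
        return Path(env_value)
    # otherwise, a file relative to where the application is run
    return Path.cwd() / default_name
```

The environment variable takes precedence so the same installed package can be pointed at different credentials without moving files around.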
# install Python (min version 3.7.1)
$ python3 -m pip install mapy-rds-dashboard
To allow future integration into other parts of the Rolling Data Scramble, and wider automation projects, the application for this project is written in Python.
Classes and methods are contained in a package, mapy_rds_dashboard. In brief, these classes and methods are used to:
These steps or tasks are intentionally split to allow for future integration into a workflow (such as Airflow).
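The intentional split into discrete steps could look like the following when wired into a workflow, with each function mapping to one task in an orchestrator such as Airflow. The step names and placeholder values are invented for illustration:

```python
def load_operations():
    """Discover the operations to process (placeholder data)."""
    return ["operation-a", "operation-b"]


def evaluate_operations(operations):
    """Evaluate the status of each operation's layers (placeholder logic)."""
    return {operation: "ok" for operation in operations}


def export_results(results):
    """Wrap results in the common export format."""
    return {"data": results, "meta": {"export_format_version": 1}}


# running the steps end-to-end; an orchestrator would schedule each step instead
export = export_results(evaluate_operations(load_operations()))
```

Because each step takes the previous step's output as its only input, a workflow engine can schedule, retry, and monitor them independently.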
Python 3.7 is used for this project to ensure compatibility with ArcGIS' Python environment.
Configuration options for this application are defined in the config.py module. This uses a typed dictionary to define the options available. Descriptions of what each config option does, and the default values used for each, are provided in this dictionary's description.

These default values can optionally be overridden using environment variables in the form: APP_RDS_DASHBOARD_{CONFIG-OPTION}.
For example, to override the all_products_product_id config option to MA1234, set an environment variable:

APP_RDS_DASHBOARD_ALL_PRODUCTS_PRODUCT_ID=MA1234
Note: The rds_operations_cmf_paths config option cannot be overridden this way.
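A minimal sketch of this override mechanism, assuming a typed dictionary with a single option; the option shown and its default value are invented placeholders, not the application's real config:

```python
import os
from typing import TypedDict

ENV_PREFIX = "APP_RDS_DASHBOARD_"


class Config(TypedDict):
    """Illustrative config shape; the real options live in config.py."""

    all_products_product_id: str


def load_config() -> Config:
    # defaults (the value here is an invented placeholder)
    config: Config = {"all_products_product_id": "MA9999"}
    for option in list(config):
        # e.g. all_products_product_id -> APP_RDS_DASHBOARD_ALL_PRODUCTS_PRODUCT_ID
        env_name = f"{ENV_PREFIX}{option.upper()}"
        if env_name in os.environ:
            config[option] = os.environ[env_name]  # type: ignore[literal-required]
    return config
```

Deriving the environment variable name from the option name keeps the override scheme mechanical, which is why an option holding structured data (such as rds_operations_cmf_paths) cannot be expressed this way.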
Exports are responsible for transforming the common Export Format into a structure or configuration specific to, and suitable for, a format or service.
To make it easier for exporters to access result information in the form they expect (e.g. organised as a flat list of results, grouped by operation, by layer, or by result, etc.) a common export format is generated.
This format forms a stable interface between how data/results are generated, and how/where these results are visualised.
This format is formally described by a JSON Schema, available from within the application Python package. In brief, it consists of an object with two members:

- data, which contains information about the results themselves
- meta, which contains information about the export itself
The structure and keys used in this export format are guaranteed to stay the same within each version. Any new versions will include a deprecation policy for removing older versions.
Current export format version: 1
Current JSON Schema: export_format_v1_schema.json
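A minimal illustration of this two-member shape, serialised with the standard library. The keys nested inside data and meta are assumptions made for the example; consult the bundled export_format_v1_schema.json for the real structure:

```python
import json

# Illustrative instance of the export format's top-level shape only; the
# nested keys are invented for this example.
export = {
    "data": {
        "results": [],  # hypothetical: evaluated layer results would go here
    },
    "meta": {
        "export_format_version": 1,
    },
}

document = json.dumps(export, indent=2)
print(document)
```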
A very simple JSON exporter is included to:
A more complex and useful exporter is also included, which uses a Pandas data frame as the source for a Google Sheets spreadsheet.
Tabs/sheets are included for:
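The simple JSON exporter could be as small as the following sketch; the function name and behaviour are assumptions based on the export.json file mentioned earlier, not the package's actual implementation:

```python
import json
from pathlib import Path


def export_to_json(export: dict, path: Path = Path("export.json")) -> Path:
    """Hypothetical minimal exporter: serialise the common export format to a file."""
    path.write_text(json.dumps(export, indent=2))
    return path
```

Because every exporter consumes the same common export format, adding a new output (a file, a spreadsheet, a dashboard service) only requires writing one new function like this.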
A local Python virtual environment managed by Poetry is used for development.
# install pyenv as per https://github.com/pyenv/pyenv#installation and/or install Python 3.7.x
# install Poetry as per https://python-poetry.org/docs/#installation
# install pre-commit as per https://pre-commit.com/
$ poetry config virtualenvs.in-project true
$ git clone https://github.com/mapaction/rolling-data-scramble-dashboard-poc.git
$ cd rolling-data-scramble-dashboard-poc/
$ poetry install
Note: Use the correct Python version for this project.
Note: To ensure the correct Python version is used, install Poetry using its installer, not as a Pip package.
Note: Running poetry config virtualenvs.in-project true is optional but recommended to keep all project components grouped together.
Python dependencies are managed using Poetry and are recorded in pyproject.toml.

- Use poetry add to add new dependencies (use poetry add --dev for development dependencies).
- Use poetry update to update all dependencies to the latest allowed versions.
- Ensure the poetry.lock file is included in the project repository.
Dependencies will be checked for vulnerabilities using Safety automatically in Continuous Integration. Dependencies can also be checked manually:
$ poetry export --dev --format=requirements.txt --without-hashes | safety check --stdin
All files should exclude trailing whitespace and include an empty final line.
Python code should be linted using Flake8:
$ poetry run flake8 src tests
This will check various aspects including:
Python code should follow PEP-8 (except line length), using the Black code formatter:
$ poetry run black src tests
Python code (except tests) should use static type hints, validated using the MyPy and TypeGuard type checkers:
$ poetry run mypy src
$ poetry run pytest --typeguard-packages mapy_rds_dashboard
These conventions and standards are enforced automatically using a combination of:
To run pre-commit hooks manually:
$ pre-commit run --all-files
All code should be covered by appropriate tests (unit, integration, etc.). Tests for this project are contained in the tests directory and run using Pytest:
$ poetry run pytest
These tests are run automatically in Continuous Integration.
Test coverage can be checked using Coverage:
$ poetry run pytest --cov
Note: Test coverage cannot measure the quality or meaningfulness of any tests written; however, it can identify code without any tests.
GitHub Actions are used to perform Continuous Integration tasks as defined in .github/workflows.
CI tasks are performed on both Linux and Windows platforms to ensure per-platform compatibility.
This project is distributed as a Python package, available through PyPI and installable through Pip.
Both source and binary (Python wheel) packages are built automatically during Continuous Deployment for all tagged releases.
Note: These packages are pure Python and compatible with all operating systems.
To build and publish packages manually:
$ poetry build
$ poetry publish --repository testpypi
Note: You will need a PyPI registry API token to publish packages, set with poetry config pypi-token.testpypi xxx.
GitHub Actions are used to perform Continuous Deployment tasks as defined in .github/workflows.
For all releases:

- set an appropriate package version using poetry version
- update CHANGELOG.md with the changes in the release
- merge changes into the main branch
Feedback of any kind is welcome.
For feedback on how this application works, please raise an issue in this GitHub repository.
For feedback on the wider context of this project, please comment on this Jira issue.
© MapAction, 2021.