NYCPlanning / data-engineering-qaqc

streamlit app for data engineering
https://edm-data-engineering.nycplanningdigital.com

Data Engineering Quality Control and Assurance Application

This web application displays charts and tables to assess the consistency, quality and completeness of a particular build of one of data engineering's data products.

The deployed app is at edm-data-engineering.nycplanningdigital.com

It's written in Python using the Streamlit framework.

Dev

To deploy the app, run the GitHub Action Deploy to Dokku - production.

NOTE: This will deploy the app using code in the branch chosen in the "Run workflow" dropdown.

To test changes, run the app locally using the devcontainer (especially via VS Code):

  1. From a dev container terminal, run ./entrypoint.sh

  2. If in VS Code, a popup should appear with an option to navigate to the site in a browser

  3. If the browser shows an "Access to localhost was denied" error, try navigating to 127.0.0.1:5000 rather than localhost:5000

If running GRU qaqc, or working on any of the GitHub API functionality, you'll need a personal access token. The app assumes it's stored in the environment variable GHP_TOKEN.
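As a rough illustration of how code might pick up that token, here is a minimal sketch. Only the variable name GHP_TOKEN comes from this README; the function name github_auth_headers and the error message are hypothetical and are not taken from the app's code.

```python
import os


def github_auth_headers(environ=os.environ):
    """Build GitHub API request headers from GHP_TOKEN (hypothetical helper)."""
    token = environ.get("GHP_TOKEN")
    if not token:
        # Fail early with a clear message instead of sending unauthenticated requests.
        raise RuntimeError("GHP_TOKEN is not set; export a GitHub personal access token first")
    return {
        "Authorization": f"Bearer {token}",
        "Accept": "application/vnd.github+json",
    }
```

Passing the environment as an argument keeps the helper easy to test without touching the real environment.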

Env Variables and Deployment

The deployed app does not have a .env file to import environment variables from. If a new environment variable is expected to exist in the deployed Dokku instance, use the following steps (source):

  1. In Digital Ocean, navigate to the dokku instance and open a Console (aka terminal)

  2. Check the current environment variables using

    dokku config edm-data-engineering

  3. Set the new environment variable using

    dokku config:set edm-data-engineering VAR=Value
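
Because the deployed instance has no .env file, a variable that was never set with dokku config:set would simply be absent at runtime. One way to surface that early is a startup check along these lines. This is a sketch under assumptions: REQUIRED_ENV_VARS and missing_env_vars are hypothetical names, and GHP_TOKEN is the only variable this README mentions.

```python
import os

# Hypothetical list; GHP_TOKEN is the only variable named in this README.
REQUIRED_ENV_VARS = ["GHP_TOKEN"]


def missing_env_vars(required, environ=os.environ):
    """Return the names in `required` that are unset or empty in `environ`."""
    return [name for name in required if not environ.get(name)]


if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED_ENV_VARS)
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
```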