The unizin-validation project contains a program written in Python that attempts to ensure Canvas data was successfully
loaded each night into Unizin data sources, including the Unizin Data Platform (UDP).
The program does this by running SQL queries against the data sources and performing basic checks on the results to detect
irregularities. The queries and checks used are defined in dbqueries.py. CSV files with the query results are generated
as part of the workflow.
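The query-check-export flow described above could be sketched roughly as follows. This is only an illustration: the function names (check_not_empty, write_results_csv), the sample column names, and the check itself are hypothetical and are not taken from dbqueries.py or validate.py, whose actual checks may differ.

```python
import csv
from pathlib import Path


def check_not_empty(rows):
    """Hypothetical basic check: a query that returns no rows is flagged."""
    return len(rows) > 0


def write_results_csv(rows, fieldnames, out_path):
    """Write query results (a list of dicts) to a CSV file at out_path."""
    out_path = Path(out_path)
    # Create the output directory if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    with out_path.open("w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return out_path
```

In this sketch, the rows would come from whatever database driver executes the SQL; a failed check would typically be logged rather than stop the CSV from being written.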
The sections below provide instructions for configuring, installing, and using the application. Depending on the environment you plan to run the application in, you may need to install one of the following:
Configuration variables for the program, validate.py, are loaded from a JSON file, typically called env.json.
To create your version of this file, make a copy of the env_sample.json template from the project's config directory;
then, add the connection parameters for each data source in the proper nested JSON object.
To connect to these data sources, you will likely need to use a VPN or Ethernet connection with the necessary permissions.
You can also use the configuration file to set the Python logging level (with LOG_LEVEL) and the path CSV files
will be written to (with OUT_DIR).
For development, it is recommended that you use the default file name, env.json, and store it in the default directory,
config (or a volume mapped to config; see the With Docker section below). However, the program checks for an environment
variable called ENV_FILE before using these defaults, so the expected path and file name can be changed if desired.
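The lookup order described above (ENV_FILE first, then config/env.json) might look something like the sketch below. It is not the code in validate.py: the function names are hypothetical, the top-level placement of LOG_LEVEL and OUT_DIR in the JSON is an assumption, and the INFO fallback for LOG_LEVEL is an assumption as well. Only the data default for OUT_DIR comes from this README.

```python
import json
import logging
import os
from pathlib import Path

# Default location described in this README.
DEFAULT_ENV_PATH = Path("config") / "env.json"


def resolve_env_path(environ=None):
    """Return the path in ENV_FILE if it is set, else the default path."""
    environ = os.environ if environ is None else environ
    return Path(environ.get("ENV_FILE", str(DEFAULT_ENV_PATH)))


def load_config(path):
    """Parse the JSON configuration file into a dictionary."""
    with Path(path).open(encoding="utf-8") as f:
        return json.load(f)


def apply_settings(config):
    """Derive the logging level and output directory from the config.

    Assumes LOG_LEVEL and OUT_DIR are top-level keys (an assumption).
    """
    level_name = str(config.get("LOG_LEVEL", "INFO")).upper()
    level = getattr(logging, level_name, None)
    if not isinstance(level, int):
        # Unknown level names fall back to INFO (an assumption).
        level = logging.INFO
    out_dir = Path(config.get("OUT_DIR", "data"))
    return level, out_dir
```

With this sketch, running the program as ENV_FILE=/some/other/env.json python validate.py would point it at a different file without changing any code.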
venv
To install and run the validation program using a Python virtual environment, do the following:
Place the env.json file described in the Configuration section (above) in the config directory.
Create and activate a virtual environment.
python3 -m venv venv
source venv/bin/activate # for macOS or Linux
Install the dependencies.
pip install -r requirements.txt
Run the program.
python validate.py
CSV files containing the query results will be written to the directory specified by the OUT_DIR configuration variable
(the default is the data directory).
You can also run the test suite by issuing the following command:
python test.py
The validation program can also be installed and run with Docker using Docker Compose. To do so, perform the following steps.
Note: these steps assume you have specified the value of OUT_DIR as the data directory and that the
configuration file will be found at the path config/env.json.
Create directories at ~/secrets/unizin-validation and ~/data/unizin-validation on your machine,
where ~ is your user's home directory.
Place the env.json file described in the Configuration section (above) in the ~/secrets/unizin-validation directory.
Build a Docker image for the project.
docker compose build
Run one of the job services.
# For the UDP job
docker compose run udp
CSV files containing the query results will be written to the ~/data/unizin-validation directory on your machine.
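If you prefer to script the first step, the two host directories can be created with a few lines of Python. This is a convenience sketch only; create_host_dirs is a hypothetical helper that is not part of the project, and creating the directories by hand works just as well.

```python
from pathlib import Path


def create_host_dirs(home=None):
    """Create the secrets and data directories under the given home directory.

    Defaults to the current user's home directory when home is None.
    """
    home = Path.home() if home is None else Path(home)
    dirs = [
        home / "secrets" / "unizin-validation",
        home / "data" / "unizin-validation",
    ]
    for d in dirs:
        # parents=True also creates ~/secrets and ~/data if they are missing.
        d.mkdir(parents=True, exist_ok=True)
    return dirs
```

Calling create_host_dirs() with no arguments creates both directories under your home directory and leaves them untouched if they already exist.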
You can also run the test suite by issuing the following command:
docker compose run test