Grist-Data-Desk / STLoR

Code and methodology to produce the dataset in Grist and High Country News' investigation into state trust lands on reservations
Creative Commons Zero v1.0 Universal

feat: Add CI. #9

Closed parkerziegler closed 1 month ago

parkerziegler commented 1 month ago

⚠️ Note: This PR is dependent on #8—do not merge before then!

This PR adds a minimal CI setup for the STLoR repo. Now, on every push to main or PR against main, we take the following steps:

  1. In a fresh GitHub-hosted Ubuntu runner, check out the latest version of 03_ActivityMatch.csv on origin/main, which should represent our source of truth. Rename this file to main_ActivityMatch.csv and store it using the upload-artifact action.
    • This is the entirety of the checkout-data job.
  2. In a sequential job, download main_ActivityMatch.csv to the root directory using the download-artifact action.
    • Note that this job acts on the branch of the current PR or commit.
  3. Install all dependencies listed in pyproject.toml.
  4. Build the dataset using the latest code changes in the PR or commit.
  5. Run our semantic comparison script between the CSV from origin/main and the just-generated dataset and print the diff (if any).

Note that we don't error if there is a diff. I expect we may have a few changes in the coming days that will result in updates to the dataset, and we don't want to treat those as errors. However, this script should at least alert us when code changes produce differences in the generated dataset.
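
For readers who want a concrete picture of the compare step, here is a minimal sketch of what a "semantic" CSV comparison that reports but never fails could look like. It is not the script in this repo. The function names (load_rows, semantic_diff, report) are hypothetical, and "semantic" is assumed here to mean "ignore row order and column order", which may differ from what the real script checks.

```python
import csv
import io
from collections import Counter


def load_rows(text):
    """Parse CSV text into a multiset of rows.

    Each row becomes a sorted tuple of (column, value) pairs, so neither
    row order nor column order affects the comparison. Assumes every row
    has the same number of fields as the header.
    """
    reader = csv.DictReader(io.StringIO(text))
    return Counter(tuple(sorted(row.items())) for row in reader)


def semantic_diff(main_text, branch_text):
    """Return (removed, added) row multisets between the two CSV texts."""
    main_rows = load_rows(main_text)
    branch_rows = load_rows(branch_text)
    removed = main_rows - branch_rows  # rows only in the origin/main copy
    added = branch_rows - main_rows    # rows only in the freshly built copy
    return removed, added


def report(removed, added):
    """Print the diff and return exit code 0 even when rows differ."""
    if not removed and not added:
        print("No differences found.")
    for row, count in removed.items():
        print(f"- {dict(row)} (x{count})")
    for row, count in added.items():
        print(f"+ {dict(row)} (x{count})")
    # A diff is informational only, so the CI job is never failed here.
    return 0
```

In a workflow, the two inputs would be the contents of main_ActivityMatch.csv and the newly generated 03_ActivityMatch.csv. Returning 0 unconditionally mirrors the choice described above: the diff shows up in the job log without blocking the PR.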

parkerziegler commented 1 month ago

Note that the compare job is expected to fail until #8 is merged. We'll know for certain if everything is working correctly once we merge that PR and trigger a new run of the workflow here.

parkerziegler commented 1 month ago

currently seeing

Run python stlor/main.py
python: can't open file '/home/runner/work/STLoR/STLoR/stlor/main.py': [Errno 2] No such file or directory
Error: Process completed with exit code 2.

on the run attempt (post-merge of previous PR). is that expected behavior or a file path issue?

Expected—the branch just needed a rebase to pick up the new directory structure.