cal-itp / reports

GTFS data quality reports for California transit providers
https://reports.calitp.org
GNU Affero General Public License v3.0

feature: adds test and schema validations #224

Closed acouch closed 1 year ago

acouch commented 1 year ago

Description

Adds tests and schema validation for report data generation.

JSON schemas for all files were added to tests/schemas. Because the code being tested queries BigQuery, it was difficult to write unit tests. I evaluated tinyquery, but it did not seem worth the effort, since I was able to create a test that verifies the data is correct while running a minimal number of queries in the warehouse.

Additionally, a reports/validate_reports.py file has been created which validates any report data in reports/outputs using the expected JSON schema.
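For illustration, the validation pass described above could look roughly like the sketch below. This is not the contents of reports/validate_reports.py. It is a standard-library-only sketch that checks a small subset of JSON Schema (`type`, `required`, `properties`, `items`), whereas a real script would more likely rely on a full validator library such as `jsonschema`. The function names `find_errors` and `validate_outputs`, and the assumption that each output file has a schema file with the same file name, are hypothetical.

```python
import json
from pathlib import Path

# Maps JSON Schema type names to Python types (only the subset handled here).
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
}


def find_errors(data, schema, path="$"):
    """Return error strings for data; supports type/required/properties/items only."""
    errors = []
    expected = schema.get("type")
    # Type names outside _TYPES (or list-valued "type") are silently skipped.
    if isinstance(expected, str) and expected in _TYPES:
        ok = isinstance(data, _TYPES[expected])
        # bool is a subclass of int in Python, so reject it for numeric types.
        if expected in ("number", "integer") and isinstance(data, bool):
            ok = False
        if not ok:
            return [f"{path}: expected {expected}, got {type(data).__name__}"]
    if isinstance(data, dict):
        for key in schema.get("required", []):
            if key not in data:
                errors.append(f"{path}: missing required key '{key}'")
        for key, sub_schema in schema.get("properties", {}).items():
            if key in data:
                errors.extend(find_errors(data[key], sub_schema, f"{path}.{key}"))
    if isinstance(data, list) and "items" in schema:
        for index, item in enumerate(data):
            errors.extend(find_errors(item, schema["items"], f"{path}[{index}]"))
    return errors


def validate_outputs(outputs_dir, schemas_dir):
    """Check every *.json under outputs_dir against the same-named schema file.

    Assumed layout (hypothetical): <schemas_dir>/<name>.json describes any
    <outputs_dir>/**/<name>.json. Returns {report_path: [errors]} for failures.
    """
    failures = {}
    for report in sorted(Path(outputs_dir).rglob("*.json")):
        schema_file = Path(schemas_dir) / report.name
        if not schema_file.exists():
            failures[str(report)] = ["no schema found"]
            continue
        data = json.loads(report.read_text())
        schema = json.loads(schema_file.read_text())
        errors = find_errors(data, schema)
        if errors:
            failures[str(report)] = errors
    return failures
```

Under those assumptions, calling `validate_outputs("reports/outputs", "tests/schemas")` would return an empty dict when every report file matches its schema, which makes it easy to turn into a non-zero exit code in CI.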

Resolves #205, resolves #107, resolves #147

Type of change

How has this been tested?

Will be covered by a GitHub Action.

acouch commented 1 year ago

@atvaccaro great note. Didn't realize pydantic was used so heavily in data-infra. Looks like it wouldn't be too hard to switch to pydantic with https://jsontopydantic.com/. Could do that in a follow-up ticket.