Open data asset tests are currently failing because we changed the name of the year field in the Assessed Values asset from year to tax_year:
requests.exceptions.HTTPError: 400 Client Error:
Bad Request for url: https://datacatalog.cookcountyil.gov/resource/uzyt-m557.json?%24query=SELECT+COUNT%28%2A%29%2Cyear+GROUP+BY+year
Here's the API response that's causing this 400 error:
{"message":"Query coordinator error: query.soql.no-such-column; No such column: year;
position: Map(row -> 1, column -> 238, line -> \"SELECT `pin`, `tax_year`, `class`, `township_code`,
`township_name`, `mailed_bldg`, `mailed_land`, `mailed_tot`, `certified_bldg`, `certified_land`, `certified_tot`,
`board_bldg`, `board_land`, `board_tot` |> SELECT COUNT(*) AS `COUNT`, `year` GROUP BY `year`\\n^\")"}
This PR adds a new optional exposure metadata field, meta.year_field, that allows us to configure the name of the year field to use for testing an open data asset. In the process, we update the metadata for the Assessed Values asset to set meta.year_field = "tax_year", and we update the script to parse this value and use it for asset queries.
Sample output (seems like the test for Parcel Proximity is failing due to missing 2023 data, which is unrelated to this PR):
$ python3 ../.github/scripts/test_open_data_assets.py target/manifest.json
Comparing asset 'Appeals' to model 'default.vw_pin_appeal'
Comparing asset 'Assessed Values' to model 'default.vw_pin_history'
Comparing asset 'Parcel Addresses' to model 'default.vw_pin_address'
Comparing asset 'Parcel Sales' to model 'default.vw_pin_sale'
Comparing asset 'Parcel Universe' to model 'default.vw_pin_universe'
Comparing asset 'Property Tax-Exempt Parcels' to model 'default.vw_pin_exempt'
Comparing asset 'Residential Condominium Unit Characteristics' to model 'default.vw_pin_condo_char'
Comparing asset 'Single and Multi-Family Improvement Characteristics' to model 'default.vw_card_res_char'
Skipping row count test for exposure `commercial_valuation_data` because it does not have an enabled `meta.test_row_count` attribute
Comparing asset 'Parcel Proximity' to model 'proximity.vw_pin10_proximity'
The following view/asset pairs had mismatching row counts by year:
| year | proximity.vw_pin10_proximity | Parcel Proximity |
| ----- | ---------------------------- | ---------------- |
| 2,000 | 1,391,640 | 1,391,633 |
| 2,001 | 1,393,200 | 1,393,192 |
| 2,002 | 1,395,432 | 1,395,427 |
| 2,003 | 1,398,150 | 1,398,148 |
| 2,004 | 1,401,578 | 1,401,574 |
| 2,005 | 1,405,153 | 1,405,147 |
| 2,006 | 1,408,873 | 1,408,870 |
| 2,007 | 1,410,752 | 1,410,750 |
| 2,008 | 1,413,379 | 1,413,374 |
| 2,009 | 1,413,923 | 1,413,915 |
| 2,010 | 1,413,896 | 1,413,888 |
| 2,011 | 1,413,893 | 1,413,887 |
| 2,012 | 1,413,934 | 1,413,929 |
| 2,013 | 1,413,969 | 1,413,960 |
| 2,014 | 1,414,003 | 1,413,995 |
| 2,015 | 1,414,377 | 1,414,369 |
| 2,016 | 1,414,637 | 1,414,634 |
| 2,017 | 1,415,087 | 1,415,083 |
| 2,018 | 1,414,933 | 1,414,929 |
| 2,019 | 1,415,409 | 1,415,410 |
| 2,020 | 1,415,881 | 1,415,880 |
| 2,021 | 1,415,666 | 1,415,663 |
| 2,022 | 1,416,601 | 1,416,600 |
| 2,023 | 1,417,374 | |
Traceback (most recent call last):
File "/home/jecochr/code/data-architecture/dbt/../.github/scripts/test_open_data_assets.py", line 240, in <module>
main()
File "/home/jecochr/code/data-architecture/dbt/../.github/scripts/test_open_data_assets.py", line 137, in main
raise ValueError("Open data asset test failed")
ValueError: Open data asset test failed
Open data asset tests are currently failing because we changed the name of the year field in the
Assessed Values
asset fromyear
totax_year
:Here's the API response that's causing this 400 error:
This PR adds a new optional exposure metadata field,
meta.year_field
, that allows us to configure the name of the year field to use for testing an open data asset. In the process, we update the metadata for theAssessed Values
asset to setmeta.year_field = "tax_year"
, and we update the script to parse this value and use it for asset queries.Sample output (seems like the test for
Parcel Proximity
is failing due to missing 2023 data, which is unrelated to this PR):