bendnorman closed this 10 months ago
Running into some weird errors with the `counties_wide_format` data frame:

```
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/local/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/app/dbcp/cli.py", line 95, in <module>
    sys.exit(main())
  File "/app/dbcp/cli.py", line 91, in main
    dbcp.data_mart.create_data_marts(args)
  File "/app/dbcp/data_mart/__init__.py", line 70, in create_data_marts
    validate_data_mart(engine=engine)
  File "/app/dbcp/validation/tests.py", line 245, in validate_data_mart
    test_county_long_vs_wide(engine)
  File "/app/dbcp/validation/tests.py", line 205, in test_county_long_vs_wide
    n_counties_wide == n_counties_long
AssertionError: counties_wide_format and counties_long_format have different county coverage
make: *** [all_local] Error 1
```
That test checks that `county_wide` and `county_long` have consistent spatial coverage. The new columns had only been added to `county_wide`, so I moved that logic into the `_get_county_properties()` function, which is used by both the `county_wide` and `county_long` constructors (I also added the new columns to the `county_long` metadata). The test also relies on that function to identify and drop county-level columns, so that only the spatial coverage of the technical data is compared. Now the test passes.

Ok, the gitbook has been updated. The data is in `data_mart_dev` and is updated with the August version of the raw data.
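For context, the county coverage check that failed above can be sketched roughly like this. This is a minimal sketch, not the actual `test_county_long_vs_wide` implementation: the real test reads both tables from the database via a SQLAlchemy engine, and the `county_id_fips` column name is an assumption.

```python
import pandas as pd


def check_county_coverage(wide: pd.DataFrame, long: pd.DataFrame) -> None:
    """Assert the wide and long county tables cover the same counties.

    Sketch only: the real test pulls counties_wide_format and
    counties_long_format from the database. county_id_fips is assumed
    to be the county identifier column.
    """
    n_counties_wide = wide["county_id_fips"].nunique()
    n_counties_long = long["county_id_fips"].nunique()
    assert n_counties_wide == n_counties_long, (
        "counties_wide_format and counties_long_format "
        "have different county coverage"
    )
```

The long table can repeat a county across many rows, so the check compares distinct county counts rather than row counts.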
Yeah, this could definitely be normalized into multiple tables; I didn't normalize it in order to speed up integration. I'll create a data mart table with the normalized data, and once the data is in BQ I'll go back and normalize the data warehouse table.
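As a rough sketch of the kind of normalization meant here (hypothetical column names, not the actual warehouse schema): election attributes that repeat on every county row can be split into their own table keyed by `election_id`.

```python
import pandas as pd

# Hypothetical denormalized table: one row per county per election,
# with election attributes repeated on every county row.
denormalized = pd.DataFrame(
    {
        "county_id_fips": ["01001", "01003", "01001"],
        "election_id": [1, 1, 2],
        "election_name": ["General", "General", "Primary"],
        "election_day": ["2022-11-08", "2022-11-08", "2022-05-24"],
    }
)

# Normalize: election attributes live in their own table, one row
# per election, keyed by election_id.
elections = (
    denormalized[["election_id", "election_name", "election_day"]]
    .drop_duplicates()
    .reset_index(drop=True)
)

# The association table keeps only the county/election keys.
county_elections = denormalized[["county_id_fips", "election_id"]]
```

Joining `county_elections` back to `elections` on `election_id` reproduces the original denormalized table.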
I made the requested changes:

- `br_election_data`, which is a copy of the data warehouse table but without the `raw_` prefixes.
- `county_commission_election_info`, which has county commission elections for each county.
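Stripping the `raw_` prefixes is a simple column rename; a minimal sketch with hypothetical column names (the real warehouse table's schema will differ):

```python
import pandas as pd

# Hypothetical warehouse table with raw_-prefixed source columns.
warehouse = pd.DataFrame(
    {
        "raw_county": ["Autauga"],
        "raw_state": ["AL"],
        "election_id": [1],
    }
)

# br_election_data is the same data with the raw_ prefixes stripped.
# rename accepts a callable applied to each column label.
br_election_data = warehouse.rename(columns=lambda c: c.removeprefix("raw_"))
```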
This PR adds the Ballot Ready data to the data warehouse, plus some information about the next election in each county.