Description
We now set the default aggregates dynamically based on whether the election is statewide, in which case the default aggregate is `postal_code`, or districtwide (e.g. House elections), in which case the default aggregate is `(postal_code, district)`.
The second change is that we no longer hard-code `postal_code` as the main aggregate when generating the aggregate predictions. Previously we always passed just `postal_code` as the largest aggregate to the model, even when we were generating `county_classification` predictions for House races. This meant that we could create `county_classification` predictions only for each state, not for each (state, district) pair. This is now resolved, since the default aggregate is used there as well.
Jira Ticket
https://arcpublishing.atlassian.net/browse/ELEX-1235
Test Steps
Added unit tests, which can be run with `tox`. To see the new functionality, run this in `develop`:
The `county_fips` predictions do not take the district into account, since we are aggregating over `postal_code, county_fips` instead of `postal_code, district, county_fips`. If you run the same invocation in this branch, this will be resolved.
Note
The model now forces the user to input `district` for districtwide races (i.e. when the office id is `H`, `Y` or `Z`), since otherwise the model may break when dealing with unexpected units. This is because `district` was not in the passed-in defaults (so we do not create a district column for the unexpected units), but it is expected as part of the default aggregates when creating the list used to generate the aggregate predictions. Here is an example:
This can likely be fixed, if we think that is necessary.
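For reviewers, the behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in this branch: `DISTRICT_OFFICES`, `default_aggregates` and `aggregate_groupings` are hypothetical names, and the way the grouping lists are built is an assumption based on the description. Only the office ids `H`, `Y`, `Z` and the column names come from this PR.

```python
# Hypothetical sketch; these names are illustrative, not the model's actual API.
DISTRICT_OFFICES = {"H", "Y", "Z"}  # office ids the note lists as districtwide


def default_aggregates(office_id):
    """Return the default aggregate columns for the given office id."""
    if office_id in DISTRICT_OFFICES:
        return ["postal_code", "district"]
    return ["postal_code"]


def aggregate_groupings(office_id, requested):
    """Build the group-by column list for each requested aggregate.

    Assumption: every aggregate is nested under the default aggregate,
    so county_fips for a House race groups by
    (postal_code, district, county_fips).
    """
    defaults = default_aggregates(office_id)
    # Districtwide races must be given `district` explicitly.
    if "district" in defaults and "district" not in requested:
        raise ValueError(
            f"office {office_id!r} is districtwide; 'district' must be "
            "included in the aggregates"
        )
    groupings = []
    for agg in requested:
        if agg in defaults:
            # Part of the default aggregate: keep columns up to and including it.
            groupings.append(defaults[: defaults.index(agg) + 1])
        else:
            groupings.append(defaults + [agg])
    return groupings
```

Under these assumptions, `aggregate_groupings("H", ["postal_code", "district", "county_fips"])` yields a `county_fips` grouping that includes `district`, whereas a statewide office id would group `county_fips` under `postal_code` only.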