techequitycollaborative / ca-leg-tracker

`bill_daily_update.py` failing on OpenStates data #24

Closed by jessiclassy 1 month ago

jessiclassy commented 1 month ago

Daily run of bill_daily_update.py returns:

Finished fetching page 1 of 106 of bill updates
Failed to update records 'current_role'
Database connection closed
Update finished
Last update timestamp: 2024-07-02T12:33:26.931622+00:00
Fetching bill updates
Finished fetching page 1 of 106 of bill updates
Failed to update records 'current_role'
Database connection closed
Update finished

So the database has been "frozen" in time since before July 2nd. The OpenStates API requests appear to succeed, but the results are not being parsed.

jessiclassy commented 1 month ago

Issue has been discovered! Within an OpenStates API response, sponsor objects can have different internal structures:

{'name': 'Seyarto',
 'entity_type': 'person',
 'person': {
     'id': 'ocd-person/d9462f08-2b27-4969-8436-9a1c21534a40',
     'name': 'Kelly Seyarto',
     'party': 'Republican',
     'current_role': {
         'title': 'Senator',
         'org_classification': 'upper',
         'district': '32',
         'division_id': 'ocd-division/country:us/state:ca/sldu:32'
     }
 },
 'primary': True,
 'classification': 'author'}
****************************************************************************************************
{'name': 'Dahle', 
'entity_type': 'person', 
'primary': False, 
'classification': 'coauthor'}

The original if clause on line 70 of bill_openstates_fetch.py did not handle sponsor objects like the second example above, which has no 'person' key and therefore no 'current_role'.
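
For anyone hitting the same error, here is a minimal sketch of defensive parsing that tolerates both sponsor shapes shown above. It is not the actual fix in bill_openstates_fetch.py: the function name `extract_sponsor` and the output field names are hypothetical, chosen only for illustration.

```python
def extract_sponsor(sponsor: dict) -> dict:
    """Flatten one sponsor object, tolerating a missing 'person' or 'current_role'.

    Hypothetical helper for illustration; not the code in bill_openstates_fetch.py.
    """
    # 'person' is absent in the second example above, so fall back to an empty dict.
    person = sponsor.get("person") or {}
    # 'current_role' may be missing as well; treat it the same way.
    role = person.get("current_role") or {}
    return {
        # Prefer the full name from the person record, else the short sponsor name.
        "name": person.get("name") or sponsor.get("name"),
        "title": role.get("title"),
        "district": role.get("district"),
        "primary": sponsor.get("primary", False),
        "classification": sponsor.get("classification"),
    }


# The two shapes from the API response above (trimmed to the fields used here).
full_sponsor = {
    "name": "Seyarto",
    "entity_type": "person",
    "person": {
        "name": "Kelly Seyarto",
        "party": "Republican",
        "current_role": {"title": "Senator", "district": "32"},
    },
    "primary": True,
    "classification": "author",
}
bare_sponsor = {
    "name": "Dahle",
    "entity_type": "person",
    "primary": False,
    "classification": "coauthor",
}

for s in (full_sponsor, bare_sponsor):
    print(extract_sponsor(s))
```

Using `.get()` with an empty-dict fallback means a sponsor without a linked person yields `None` for the role fields instead of raising a `KeyError`, so one incomplete record does not abort the whole batch.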

jessiclassy commented 1 month ago

After pushing the bug fix and deploying it to the server, I'll have to monitor the server for the next few hours while the batch updates are processed. This will be standard practice if/when database updates are interrupted in the future.

Source: tested the fix on ca_test schema and monitored fetching progress over the past few hours.

jessiclassy commented 1 month ago

Re-running these changes on ca_dev did not update the schema as expected. This reveals that any breakage in snapshot updates blocks updates to the front-end tables.