unitedstates / congress

Public domain data collectors for the work of Congress, including legislation, amendments, and votes.
https://github.com/unitedstates/congress/wiki
Creative Commons Zero v1.0 Universal

Errors when parsing amendments for 118th Congress #299

Closed. achokey-crp closed this issue 2 months ago.

achokey-crp commented 1 year ago

I ran the script for the 118th Congress and some KeyErrors popped up while processing amendments. Could there be a new schema?

Here's a snippet of the error output, where the keys activities, description, and proposedDate were missing from the parsed data.

congress-library_1  | [s4003-117] Exception:
congress-library_1  | 
congress-library_1  | Traceback (most recent call last):
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/utils.py", line 174, in process_set
congress-library_1  |     results = fetch_func(id, options, *extra_args)
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/bills.py", line 148, in process_bill
congress-library_1  |     process_amendments(bill_id, xml_as_dict, options)
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/bills.py", line 292, in process_amendments
congress-library_1  |     amendment_info.process_amendment(amdt, bill_id, options)
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/amendment_info.py", line 25, in process_amendment
congress-library_1  |     xml_file.write(create_govtrack_xml(amdt, options))
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/amendment_info.py", line 116, in create_govtrack_xml
congress-library_1  |     make_node(root, "description", amdt["description"] if amdt["description"] else amdt["purpose"])
congress-library_1  | 
congress-library_1  | KeyError: 'description'
congress-library_1  | 
congress-library_1  | 
congress-library_1  | [s4008-117] Exception:
congress-library_1  | 
congress-library_1  | Traceback (most recent call last):
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/utils.py", line 174, in process_set
congress-library_1  |     results = fetch_func(id, options, *extra_args)
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/bills.py", line 148, in process_bill
congress-library_1  |     process_amendments(bill_id, xml_as_dict, options)
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/bills.py", line 292, in process_amendments
congress-library_1  |     amendment_info.process_amendment(amdt, bill_id, options)
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/amendment_info.py", line 13, in process_amendment
congress-library_1  |     amdt = build_amendment_json_dict(amdt_data, options)
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/amendment_info.py", line 72, in build_amendment_json_dict
congress-library_1  |     amdt['proposed_at'] = amdt_dict['proposedDate']
congress-library_1  | 
congress-library_1  | KeyError: 'proposedDate'
congress-library_1  | 
congress-library_1  | 
congress-library_1  | [s4065-117] Exception:
congress-library_1  | 
congress-library_1  | Traceback (most recent call last):
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/utils.py", line 174, in process_set
congress-library_1  |     results = fetch_func(id, options, *extra_args)
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/bills.py", line 127, in process_bill
congress-library_1  |     bill_data = form_bill_json_dict(xml_as_dict)
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/bills.py", line 257, in form_bill_json_dict
congress-library_1  |     'committees': bill_info.committees_for(billCommittees),
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/bill_info.py", line 267, in committees_for
congress-library_1  |     return sum([build_dict(committee) for committee in committee_list], [])
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/bill_info.py", line 267, in <listcomp>
congress-library_1  |     return sum([build_dict(committee) for committee in committee_list], [])
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/bill_info.py", line 249, in build_dict
congress-library_1  |     'activity': get_activitiy_list(item),
congress-library_1  | 
congress-library_1  |   File "/usr/local/lib/python3.8/site-packages/congress/tasks/bill_info.py", line 237, in get_activitiy_list
congress-library_1  |     if not item['activities']:
congress-library_1  | 
congress-library_1  | KeyError: 'activities'
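
All three tracebacks end in a direct dictionary lookup (`amdt["description"]`, `amdt_dict['proposedDate']`, `item['activities']`) on a key that is not there. Below is a minimal sketch of how such lookups could tolerate missing keys. It is not the project's actual fix, and it assumes the keys are simply absent from the source data rather than renamed. The helpers `first_present` and `as_list` and the sample records are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, List, Optional


def first_present(data: Dict[str, Any], *keys: str) -> Optional[Any]:
    """Return the first truthy value stored under any of `keys`, else None."""
    for key in keys:
        value = data.get(key)  # .get() returns None instead of raising KeyError
        if value:
            return value
    return None


def as_list(value: Any) -> List[Any]:
    """Normalize a missing, single, or repeated value into a list."""
    if value is None:
        return []
    if isinstance(value, list):
        return value
    return [value]


# Hypothetical amendment record lacking "description" and "proposedDate".
amdt = {"purpose": "To improve the bill."}
description = first_present(amdt, "description", "purpose")
proposed_at = amdt.get("proposedDate")  # None when the key is absent

# Hypothetical committee record lacking "activities".
committee = {"name": "Finance Committee"}
activities = as_list(committee.get("activities"))
```

Whether a silent default is appropriate depends on the field: an absent description can reasonably fall back to the purpose text, but a missing proposedDate may be worth logging, since it could point to a real schema change.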