usgpo / bill-status

Information about Bill Status XML Bulk Data including the XML User Guide.
https://www.govinfo.gov/bulkdata/BILLSTATUS

Upcoming Adjustment to Action/Committee Element #147

Closed jonquandt closed 4 years ago

jonquandt commented 4 years ago

Due to a request from Legislative stakeholders, BILLSTATUS files will soon have an updated element for Committees associated with actions. Key changes include:

For BILLSTATUS-115s2979:

<item>
  <actionDate>2018-05-24</actionDate>
  <committees>
    <item>
      <systemCode>ssap00</systemCode>
      <name>Appropriations Committee</name>
    </item>
    <item>
      <systemCode>ssbu00</systemCode>
      <name>Budget Committee</name>
    </item>
  </committees>
  <links/>
  <sourceSystem>
    <code>0</code>
    <name>Senate</name>
  </sourceSystem>
  <text>Read twice and referred concurrently to the Committees on Appropriations; the Budget pursuant to the order of January 30, 1975, as modified by the order of April 11, 1986, with instructions that the Budget Committee be authorized to report its views to the Appropriations Committee, and that the latter alone be authorized to report the bill.</text>
  <type>IntroReferral</type>
</item>

Note: due to differences in the data provided and requirements from stakeholders, House actions will continue to have a single committee assigned per action; this can result in multiple actions with the same text, each assigned to a different committee. For an existing example, see BILLSTATUS-116hr796 for the actionCode H11100 dated 2019-01-25.
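For consumers who would rather see one logical action per referral, the duplicated House actions described above could be folded together after parsing. The sketch below assumes actions have already been read into plain dicts; the key names (`actionDate`, `actionCode`, `text`, `committee`), the function name, and the sample rows are hypothetical and are not taken from a real BILLSTATUS file.

```python
def merge_duplicate_actions(actions):
    """Group actions that share date, code, and text; collect their committees."""
    merged = {}  # dicts keep insertion order, so the first-seen order is preserved
    for action in actions:
        key = (action["actionDate"], action["actionCode"], action["text"])
        entry = merged.setdefault(key, {
            "actionDate": action["actionDate"],
            "actionCode": action["actionCode"],
            "text": action["text"],
            "committees": [],
        })
        committee = action.get("committee")
        if committee and committee not in entry["committees"]:
            entry["committees"].append(committee)
    return list(merged.values())


# Made-up rows for illustration only (placeholder text and committee codes).
rows = [
    {"actionDate": "2019-01-25", "actionCode": "H11100",
     "text": "Placeholder referral text.", "committee": "hsaa00"},
    {"actionDate": "2019-01-25", "actionCode": "H11100",
     "text": "Placeholder referral text.", "committee": "hsbb00"},
]
merged_rows = merge_duplicate_actions(rows)
print(merged_rows[0]["committees"])
```

Keying on date, code, and text together is a conservative choice: two actions are only merged when all three match.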

This will likely result in a need to update scripts that parse for committee information. We anticipate this change to be in place in mid-January 2020. A more firm timeline will be provided in the first week of January.
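As a rough illustration of the kind of script update this implies, the sketch below reads committee information from an action in either shape. The new shape follows the BILLSTATUS-115s2979 sample in this issue. The legacy shape is an assumption: a single `<committee>` element with the same `systemCode` and `name` children. The function name `action_committees` is hypothetical.

```python
import xml.etree.ElementTree as ET


def action_committees(action):
    """Return a list of (systemCode, name) tuples for one action <item>."""
    found = []
    # New shape: <committees> wrapping one <item> per committee.
    for node in action.findall("./committees/item"):
        found.append((node.findtext("systemCode"), node.findtext("name")))
    # Legacy shape (assumed): a single <committee> directly under the action.
    legacy = action.find("./committee")
    if legacy is not None:
        found.append((legacy.findtext("systemCode"), legacy.findtext("name")))
    return found


# Trimmed copy of the sample action from this issue (the <text> element is omitted).
SAMPLE = """
<item>
  <actionDate>2018-05-24</actionDate>
  <committees>
    <item>
      <systemCode>ssap00</systemCode>
      <name>Appropriations Committee</name>
    </item>
    <item>
      <systemCode>ssbu00</systemCode>
      <name>Budget Committee</name>
    </item>
  </committees>
  <links/>
  <type>IntroReferral</type>
</item>
"""

print(action_committees(ET.fromstring(SAMPLE)))
```

Handling both shapes in one function lets the same parser run against files processed before and after the change, which matters while older congresses are still being reprocessed.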

Once this change is in effect for day-forward BILLSTATUS files, GPO plans to reprocess all existing BILLSTATUS files in the bulkdata repository, beginning with the 116th Congress and moving backwards to the 113th Congress.

There are sample files located here. Additional samples can be provided if needed.

This issue will be closed once the change is live and all existing packages have been reprocessed.

Status of change

Reprocessing Status:

JoshData commented 4 years ago

It looks like this change is already deployed. Starting on Dec. 13, I began seeing bills in the 116th Congress with "committees" instead of "committee" in actions: 689 at first (example: HR 3), and currently 1,897. Folks using the congress project tools are tracking this issue at https://github.com/unitedstates/congress/issues/245.

jonquandt commented 4 years ago

@JoshData - we’re looking into why this configuration was deployed early. We will reset this and reprocess the affected packages. Apologies for the inconvenience!

JoshData commented 4 years ago

Thanks!

jonquandt commented 4 years ago

@JoshData - we have implemented a fix to prevent the <committees> element from appearing again and are in the process of reprocessing the affected packages -- to be safe, we are reprocessing everything between 12/10/2019 and this evening.

Note: new BILLSTATUS files since ~4:45pm should have the current correct <committee> element under actions

I will provide an updated status when the reprocessing is complete via a comment on this issue -- I can also put a comment on your linked issue if you like.

jonquandt commented 4 years ago

All affected billstatus packages were reprocessed last week.

jonquandt commented 4 years ago

As an update, we are now planning to roll this out in February - likely next week. I will provide an update when we have a firm date.

JoshData commented 4 years ago

Thanks for the update!

jonquandt commented 4 years ago

This will be going live tomorrow around lunchtime Eastern. I will provide another update with the specific date-time for the cutoff of new and updated BILLSTATUS that will have the new committees element.

Note: This will likely cause a slight delay in the afternoon's processing.

jonquandt commented 4 years ago

As of 2020-02-05T16:48:27Z, all BILLSTATUS packages processed will include the committees element under actions.

https://api.govinfo.gov/collections/BILLSTATUS/2020-02-05T16:48:27Z?pageSize=100&offset=0&api_key=DEMO_KEY
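For anyone scripting against this cutoff, the URL above can be assembled from its parts. This is only a sketch that mirrors the query parameters visible in that URL (`pageSize`, `offset`, `api_key`); the function name is hypothetical, and no request is made here.

```python
from urllib.parse import urlencode

BASE = "https://api.govinfo.gov/collections"


def collections_url(collection, since, page_size=100, offset=0, api_key="DEMO_KEY"):
    """Build a collections URL for packages modified since the given timestamp."""
    query = urlencode({"pageSize": page_size, "offset": offset, "api_key": api_key})
    return f"{BASE}/{collection}/{since}?{query}"


url = collections_url("BILLSTATUS", "2020-02-05T16:48:27Z")
print(url)
```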

We will reprocess the 116th Congress packages over the weekend to ensure that they are fully up to date while minimizing impact on items that are currently being updated.

We will also go back and do remaining congresses once we complete the 116th.

cc @JoshData