usgpo / bill-status

Information about Bill Status XML Bulk Data including the XML User Guide.
https://www.govinfo.gov/bulkdata/BILLSTATUS

Bill Status actions questions #6

Closed dwillis closed 8 years ago

dwillis commented 8 years ago

Bill status files like this one contain nearly duplicative actions. For example:

[Image: screenshot, 2016-04-21 4:49:48 PM]

The differences I can see occur in actionCode and sometimes in sourceSystem. Should we expect that a bill might have the same action, but with different actionCodes? Which one is considered canonical? Is one sourceSystem preferable to another?

Thanks!

JoshData commented 8 years ago

Relatedly, what logic does Congress.gov use to squash some redundant actions? Can we get a peek at the source code that handles that?

wcarter commented 8 years ago

Any further information on redundant actions? Did this get answered somewhere else and I missed it?

104PL104 commented 8 years ago

Yes, expect a bill to have the same action but with different actionCodes. The Action Codes listed at https://www.congress.gov/help/field-values/action-codes generally condense detailed legislative action steps/"primary source data."

When House or Senate are the sourceSystem, understand those to be "primary source data." When LOC is the sourceSystem, consider that to be a data enhancement - intended to help provide meaningful context and supplement "primary source data."
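The distinction above can be sketched in Python as a split of a bill's actions into "primary source data" and Library of Congress enhancements. This is only a minimal sketch: the element names (`item`, `actionDate`, `text`, `actionCode`, `sourceSystem/name`), the "Library of Congress" label, the function name `split_by_source`, and the sample records are assumptions made for illustration, so check them against the XML User Guide and a real file before relying on them.

```python
import xml.etree.ElementTree as ET

# Made-up sample shaped like an <actions> block; real files may differ.
SAMPLE = """
<actions>
  <item>
    <actionDate>2015-01-06</actionDate>
    <text>Introduced in House</text>
    <actionCode>Intro-H</actionCode>
    <sourceSystem><name>Library of Congress</name></sourceSystem>
  </item>
  <item>
    <actionDate>2015-01-06</actionDate>
    <text>Introduced in House</text>
    <actionCode>1000</actionCode>
    <sourceSystem><name>Library of Congress</name></sourceSystem>
  </item>
  <item>
    <actionDate>2015-01-06</actionDate>
    <text>Referred to the House Committee on Financial Services.</text>
    <actionCode>H11100</actionCode>
    <sourceSystem><name>House floor actions</name></sourceSystem>
  </item>
</actions>
"""


def split_by_source(actions_xml, enhancement_name="Library of Congress"):
    """Return (primary, enhancement) lists of action records.

    Actions whose sourceSystem name equals `enhancement_name` are treated
    as data enhancements; everything else is treated as primary source data.
    The label used for the comparison is an assumption, not a documented value.
    """
    root = ET.fromstring(actions_xml)
    primary, enhancement = [], []
    # Only direct <item> children, so nested <item> elements are not picked up.
    for item in root.findall("item"):
        record = {
            "date": item.findtext("actionDate", default=""),
            "text": item.findtext("text", default=""),
            "code": item.findtext("actionCode", default=""),
            "source": item.findtext("sourceSystem/name", default=""),
        }
        if record["source"] == enhancement_name:
            enhancement.append(record)
        else:
            primary.append(record)
    return primary, enhancement


primary, enhancement = split_by_source(SAMPLE)
print(len(primary), "primary;", len(enhancement), "enhancement")
```

Keeping both lists, rather than discarding one, leaves the choice of which source to treat as canonical to the caller.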

Various code sets are used by multiple systems in the House, Senate, and Library of Congress by legislative clerks and data editors for functions independent of this bulk data set. As new codes and systems were developed (since the early 1970s), there was no coordinated effort to retroactively apply new codes to old records. Many codes are concatenated with other codes or elements, or use free text. Codes in one set may be redundant with a different code in another code set. Additionally, some codes may have been used and re-used over the years for different purposes, further complicating the ability to create an authoritative list.

The Actions tab view options and Latest Action in the overview (see https://www.congress.gov/bill/114th-congress/house-bill/26/actions) show 4 different volumes of action information. Congress.gov serves multiple audiences

MokeEire commented 2 years ago

There also appears to be a set of action codes that do not correspond to any codes listed at https://www.congress.gov/help/field-values/action-codes, e.g. Intro-H, H11100.

How would you suggest someone parse the distinct actions taken on a bill?

104PL104 commented 2 years ago

My suggestion is to focus on Action Codes, which are a controlled vocabulary. Action Codes were created to help machines, and people, have a consistent method to use for data from 1973-present. As we slowly move deeper into historical data we will use a subset of the same Action Codes.

Action Codes identify stages that condense detailed legislative action steps.

Actions that do not have Action Codes should not be expected to be used consistently or to be sufficiently unique.
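One way to apply this suggestion, sketched in Python under stated assumptions: group actions that share a date and text, then separate the codes that belong to the controlled vocabulary from the rest. The record layout, the function name `group_actions`, and the contents of `KNOWN_CODES` are hypothetical; in particular, `KNOWN_CODES` is a one-entry placeholder that would need to be filled in from the Action Codes page.

```python
from collections import defaultdict

# Placeholder only: populate from the published Action Codes list.
KNOWN_CODES = {"1000"}

# Made-up records for illustration, using codes mentioned in this thread.
actions = [
    {"date": "2015-01-06", "text": "Introduced in House", "code": "Intro-H"},
    {"date": "2015-01-06", "text": "Introduced in House", "code": "1000"},
    {"date": "2015-01-06",
     "text": "Referred to the House Committee on Financial Services.",
     "code": "H11100"},
]


def group_actions(records, known_codes):
    """Group records by (date, text) and split their codes into two buckets.

    "coded" holds codes found in `known_codes` (the controlled vocabulary);
    "other" holds every remaining code, including empty ones.
    """
    grouped = defaultdict(lambda: {"coded": [], "other": []})
    for rec in records:
        key = (rec["date"], rec["text"])
        bucket = "coded" if rec["code"] in known_codes else "other"
        grouped[key][bucket].append(rec["code"])
    return dict(grouped)


grouped = group_actions(actions, KNOWN_CODES)
for (date, text), codes in grouped.items():
    print(date, text, codes)
```

Grouping on date and text only collapses actions whose text matches exactly; near-duplicates with slightly different wording would stay separate under this sketch.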

MokeEire commented 2 years ago

What about action codes not listed in the table you linked? Should they be expected to be consistently used?

Of the top 10 action codes used in the 117th Congress's data, only two are listed in that table.
