usgpo / api

services to access govinfo content and metadata
https://api.govinfo.gov

ideas #164

Closed sergio-rivas closed 1 week ago

sergio-rivas commented 1 month ago

Request: Add an endpoint for bill actions and text versions, for a congress

When trying to backfill data locally for analysis, the only way to get bill actions or text versions is through the endpoint for a specific bill. This ends up limiting the response to only ~10 actions per request, even though the API can technically support up to 250 items per request. This could be implemented similarly to the Summaries API.

[Screenshot: Screen Shot 2024-10-16 at 12 16 25 AM]

To run a macro analysis, it is important to get such granular data. For example, to generate a chart of every bill's path through its lifecycle, it is important to have the full list of actions for a bill. To analyze the approximate percentage change per text version, it is important to have the structured URLs for all the text versions of a bill.

jonquandt commented 1 month ago

@sergio-rivas - based on the screenshot, your question appears to be for the congress.gov team. You may want to create an issue at https://github.com/LibraryOfCongress/api.congress.gov/issues

That being said, you could get much of what you're looking for from the GovInfo API (see its documentation), using the collections or search endpoints.

Example collections request:

https://api.govinfo.gov/collections/BILLS/2024-10-01T00:00:00Z?offsetMark=*&pageSize=1000&api_key=DEMO_KEY

This shows bill texts that have been added or changed since 10/1/2024.
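If you are scripting the backfill, the collections URL can be assembled from its parts. This is only a minimal Python sketch: the helper name collections_url and its defaults are made up for illustration, and the path and query parameters are copied from the example request above.

```python
from urllib.parse import urlencode

BASE = "https://api.govinfo.gov"


def collections_url(collection, since, offset_mark="*", page_size=1000,
                    api_key="DEMO_KEY"):
    """Build a collections request URL (hypothetical helper, not part of the API).

    `since` is an ISO 8601 timestamp such as 2024-10-01T00:00:00Z.
    """
    # safe="*" keeps the initial offsetMark readable, as in the example URL;
    # any other special characters in a later offsetMark are still escaped.
    query = urlencode(
        {"offsetMark": offset_mark, "pageSize": page_size, "api_key": api_key},
        safe="*",
    )
    return f"{BASE}/collections/{collection}/{since}?{query}"


print(collections_url("BILLS", "2024-10-01T00:00:00Z"))
```

To page through results, the same helper could be called again with the offsetMark value from the previous response in place of the default "*".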

By following the packageLink in each response, you can get the text content via the download.txtLink or download.xmlLink, get a list of related bill versions using the relatedLink, or pull the BILLSTATUS XML via the related.billStatusLink. The BILLSTATUS file aggregates information from the congress.gov API bill endpoints into a single XML file.

Here's a sample bill from the 118th congress: https://api.govinfo.gov/packages/BILLS-118s1549enr/summary?api_key=DEMO_KEY

There are a number of links that may be of use from the following keys:

    "download": {
        "premisLink": "https://api.govinfo.gov/packages/BILLS-118s1549enr/premis",
        "xmlLink": "https://api.govinfo.gov/packages/BILLS-118s1549enr/xml",
        "txtLink": "https://api.govinfo.gov/packages/BILLS-118s1549enr/htm",
        "zipLink": "https://api.govinfo.gov/packages/BILLS-118s1549enr/zip",
        "modsLink": "https://api.govinfo.gov/packages/BILLS-118s1549enr/mods",
        "pdfLink": "https://api.govinfo.gov/packages/BILLS-118s1549enr/pdf",
        "uslmLink": "https://api.govinfo.gov/packages/BILLS-118s1549enr/uslm"
    },
    "pages": "1",
    "related": {"billStatusLink": "https://api.govinfo.gov/packages/BILLSTATUS-118s1549/xml"},
    "relatedLink": "https://api.govinfo.gov/related/BILLS-118s1549enr",
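As a rough illustration of picking those links out of a parsed summary response, here is a small Python sketch. The function name useful_links is hypothetical, and the key names are taken only from the excerpt above, so a missing key is returned as None rather than treated as an error.

```python
# Trimmed copy of the summary excerpt above, as a Python dict.
SAMPLE_SUMMARY = {
    "download": {
        "xmlLink": "https://api.govinfo.gov/packages/BILLS-118s1549enr/xml",
        "txtLink": "https://api.govinfo.gov/packages/BILLS-118s1549enr/htm",
    },
    "related": {
        "billStatusLink": "https://api.govinfo.gov/packages/BILLSTATUS-118s1549/xml"
    },
    "relatedLink": "https://api.govinfo.gov/related/BILLS-118s1549enr",
}


def useful_links(summary):
    """Collect the text, XML, bill status, and related links from a summary.

    Hypothetical helper: it assumes the key names shown in the excerpt and
    returns None for any link that is absent (not every response carries
    every key).
    """
    download = summary.get("download") or {}
    related = summary.get("related") or {}
    return {
        "text": download.get("txtLink"),
        "xml": download.get("xmlLink"),
        "bill_status": related.get("billStatusLink"),
        "related": summary.get("relatedLink"),
    }


for name, link in useful_links(SAMPLE_SUMMARY).items():
    print(name, link)
```

Remember that each link still needs the api_key parameter appended when it is requested.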

Search request

Here's an example search request that you could use as a starting point to iterate through and grab all of the text or other information you would like related to Congressional bills.

curl -X 'POST' \
  'https://api.govinfo.gov/search?api_key=DEMO_KEY' \
  -H 'accept: application/json' \
  -H 'Content-Type: application/json' \
  -d '{
  "query": "collection:bills congress:118",
  "pageSize": 100,
  "offsetMark": "*",
  "sorts": [
    {
      "field": "relevancy",
      "sortOrder": "DESC"
    }
  ],
  "historical": true,
  "resultLevel": "default"
}'

This will return a set of results that look something like this:

{
  "results": [
    {
      "title": "Recognizing October 15, 2024, as the day to honor the diaspora of Hispanic culture, and the representation of Hispanics in the legal profession and the judiciary.",
      "packageId": "BILLS-118hres1545ih",
      "granuleId": null,
      "lastModified": "2024-10-16T03:27:52Z",
      "governmentAuthor": [
        "Congress",
        "House of Representatives"
      ],
      "dateIssued": "2024-10-15",
      "collectionCode": "BILLS",
      "resultLink": "https://api.govinfo.gov/packages/BILLS-118hres1545ih/summary",
      "dateIngested": "2024-10-15",
      "download": {
        "premisLink": "https://api.govinfo.gov/packages/BILLS-118hres1545ih/premis",
        "xmlLink": "https://api.govinfo.gov/packages/BILLS-118hres1545ih/xml",
        "txtLink": "https://api.govinfo.gov/packages/BILLS-118hres1545ih/htm",
        "zipLink": "https://api.govinfo.gov/packages/BILLS-118hres1545ih/zip",
        "modsLink": "https://api.govinfo.gov/packages/BILLS-118hres1545ih/mods",
        "pdfLink": "https://api.govinfo.gov/packages/BILLS-118hres1545ih/pdf"
      },
      "relatedLink": "https://api.govinfo.gov/related/BILLS-118hres1545ih"
    },

As you can see, you can directly access the content files or the related service from the search service call. The search service can return up to 1000 results per call - you can iterate to the next set by updating the curl request to include the offsetMark provided in the response.
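The iteration could look roughly like the Python sketch below. The request body mirrors the curl example; the names post_search and iter_results are made up for illustration, and the stopping rule (stop on an empty page or an unchanged offsetMark) is an assumption rather than documented API behavior.

```python
import json
from urllib.request import Request, urlopen

SEARCH_URL = "https://api.govinfo.gov/search"


def post_search(body, api_key="DEMO_KEY"):
    """POST one search request and return the parsed JSON response."""
    req = Request(
        f"{SEARCH_URL}?api_key={api_key}",
        data=json.dumps(body).encode("utf-8"),
        headers={"accept": "application/json",
                 "Content-Type": "application/json"},
        method="POST",
    )
    with urlopen(req) as resp:
        return json.load(resp)


def iter_results(query, page_size=100, fetch=post_search):
    """Yield every search result, following offsetMark from page to page.

    `fetch` is any callable that takes the request body and returns the
    parsed response, which makes the loop easy to try without network access.
    """
    body = {
        "query": query,
        "pageSize": page_size,
        "offsetMark": "*",
        "sorts": [{"field": "relevancy", "sortOrder": "DESC"}],
        "historical": True,
        "resultLevel": "default",
    }
    while True:
        page = fetch(body)
        results = page.get("results") or []
        if not results:
            return
        yield from results
        next_mark = page.get("offsetMark")
        # Assumed stopping rule: no new mark means there is nothing more to read.
        if not next_mark or next_mark == body["offsetMark"]:
            return
        body = {**body, "offsetMark": next_mark}
```

Usage would be something like `for result in iter_results("collection:bills congress:118"): ...`, reading result["packageId"] or result["download"] as needed. DEMO_KEY is heavily rate limited, so a real key is advisable for a full backfill.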

Related service

If you call the related service for a bill that has multiple versions, you will see something like this: https://api.govinfo.gov/related/BILLS-118s1549enr/BILLS?api_key=DEMO_KEY

{
    "results": [
        {
            "dateIssued": "2023-05-10",
            "billVersion": "is",
            "packageId": "BILLS-118s1549is",
            "packageLink": "https://api.govinfo.gov/packages/BILLS-118s1549is/summary",
            "billVersionLabel": "Introduced in Senate",
            "lastModified": "2024-06-06T19:51:32Z"
        },
        {
            "dateIssued": "2023-06-13",
            "billVersion": "rs",
            "packageId": "BILLS-118s1549rs",
            "packageLink": "https://api.govinfo.gov/packages/BILLS-118s1549rs/summary",
            "billVersionLabel": "Reported in Senate",
            "lastModified": "2024-06-06T19:35:16Z"
        },
        {
            "dateIssued": "2023-06-22",
            "billVersion": "es",
            "packageId": "BILLS-118s1549es",
            "packageLink": "https://api.govinfo.gov/packages/BILLS-118s1549es/summary",
            "billVersionLabel": "Engrossed in Senate",
            "lastModified": "2024-06-06T19:22:54Z"
        },
        {
            "dateIssued": "2024-09-25",
            "billVersion": "enr",
            "packageId": "BILLS-118s1549enr",
            "packageLink": "https://api.govinfo.gov/packages/BILLS-118s1549enr/summary",
            "billVersionLabel": "Enrolled Bill",
            "lastModified": "2024-09-25T02:37:43Z"
        }
    ],
    "relatedId": "BILLS-118s1549enr"
}

This will again provide you with access to the content via the packageLink, and the dateIssued could be useful for your comment: "For example, to generate a chart of every bill's path through its lifecycle, it is important to have the full list of actions for a bill."
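To sketch how dateIssued might feed that kind of chart, the Python snippet below orders the versions from a related-service response into a simple timeline. The function name version_timeline is hypothetical, and the sample data is a trimmed, deliberately shuffled copy of the response above.

```python
# Trimmed, shuffled copy of the related-service response shown above.
SAMPLE_RELATED = {
    "results": [
        {"dateIssued": "2024-09-25", "billVersion": "enr",
         "billVersionLabel": "Enrolled Bill"},
        {"dateIssued": "2023-05-10", "billVersion": "is",
         "billVersionLabel": "Introduced in Senate"},
        {"dateIssued": "2023-06-22", "billVersion": "es",
         "billVersionLabel": "Engrossed in Senate"},
        {"dateIssued": "2023-06-13", "billVersion": "rs",
         "billVersionLabel": "Reported in Senate"},
    ],
    "relatedId": "BILLS-118s1549enr",
}


def version_timeline(related):
    """Return (dateIssued, billVersion, billVersionLabel) tuples, oldest first.

    ISO 8601 dates (YYYY-MM-DD) sort correctly as plain strings, so no date
    parsing is needed for the ordering itself.
    """
    rows = [
        (r["dateIssued"], r["billVersion"], r["billVersionLabel"])
        for r in related.get("results", [])
    ]
    return sorted(rows)


for date_issued, version, label in version_timeline(SAMPLE_RELATED):
    print(date_issued, version, label)
```

This only covers text versions. Individual actions would still have to come from the BILLSTATUS XML.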

Note that the bill status information provided by Congress.gov (or via the BILLSTATUS xml in the GovInfo API) will not show text versions that do not exist in GovInfo (or Congress.gov, which pulls bill text from GovInfo). There are sometimes delays between a text version being created within Congress and official dissemination via GovInfo due to workload and other factors.
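For the full list of actions, the BILLSTATUS XML can be parsed locally once it has been downloaded. The Python sketch below is illustrative only: the element names (billStatus, bill, actions, item, actionDate, text) are an assumption about the BILLSTATUS layout and should be checked against a real file, list_actions is a made-up helper name, and the sample document contains placeholder text, not real bill data.

```python
import xml.etree.ElementTree as ET

# Placeholder document following the ASSUMED BILLSTATUS layout; the action
# text here is invented for illustration and is not real bill data.
SAMPLE_BILLSTATUS = """
<billStatus>
  <bill>
    <actions>
      <item>
        <actionDate>2023-06-13</actionDate>
        <text>Placeholder action B</text>
      </item>
      <item>
        <actionDate>2023-05-10</actionDate>
        <text>Placeholder action A</text>
      </item>
    </actions>
  </bill>
</billStatus>
"""


def list_actions(billstatus_xml):
    """Return the bill's actions as dicts with 'date' and 'text', oldest first.

    Accepts str or bytes. Only items directly under bill/actions are read,
    so actions nested elsewhere in the file are ignored.
    """
    root = ET.fromstring(billstatus_xml)
    actions = [
        {
            "date": item.findtext("actionDate", default=""),
            "text": item.findtext("text", default=""),
        }
        for item in root.findall("./bill/actions/item")
    ]
    # sorted() is stable, so same-day actions keep their document order.
    return sorted(actions, key=lambda a: a["date"])


for action in list_actions(SAMPLE_BILLSTATUS):
    print(action["date"], action["text"])
```

When reading a downloaded file, passing the raw bytes avoids problems with the XML encoding declaration.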

I hope this is helpful. Let me know if there's something I can clarify.