(<textVersions>): formats missing in some legislation #232

Open ryparker opened 4 months ago

ryparker commented 4 months ago

Some bill-status files return <textVersion> items with empty <formats> tags; however, this data is available on the API, and the actual format files (PDF, XML, HTML) are available from GPO.

Example request:

curl --location ''


<?xml version="1.0" encoding="utf-8" standalone="no"?>
            <![CDATA[<pre>[Congressional Record Volume 169, Number 84 (Thursday, May 18, 2023)] [House] From the Congressional Record Online through the Government Publishing Office [<a href=""></a>] By Ms. HAGEMAN: H.R. 3496. Congress has the power to enact this legislation pursuant to the following: Article I Section 8 The single subject of this legislation is: To make technical amendments to update statutory references to certain provisions which were formerly classified to chapters 14 and 19 of title 25, United States Code, and to correct related technical errors [Page H2458]</pre>]]>
                <name>Judiciary Committee</name>
                        <name>Referred to</name>
                <name>Judiciary Committee</name>
                        <name>Reported by</name>
                        <name>Markup by</name>
                        <name>Referred to</name>
                <citation>H. Rept. 118-235</citation>
                <title>SUPPORT for Patients and Communities Reauthorization Act</title>
                    <text>Placed on Senate Legislative Calendar under General Orders. Calendar No. 319.</text>
                        <type>Related bill</type>
                        <name>Judiciary Committee</name>
                <text>Received in the Senate and Read twice and referred to the Committee on the Judiciary.</text>
                <text>Motion to reconsider laid on the table Agreed to without objection.</text>
                    <name>House floor actions</name>
                <text>On motion to suspend the rules and pass the bill Agreed to by voice vote. (text: CR H5679-5687)</text>
                    <name>House floor actions</name>
                <text>Passed/agreed to in House: On motion to suspend the rules and pass the bill Agreed to by voice vote.(text: CR H5679-5687)</text>
                    <name>Library of Congress</name>
                <text>DEBATE - The House proceeded with forty minutes of debate on H.R. 3496.</text>
                    <name>House floor actions</name>
                <text>Considered under suspension of the rules. (consideration: CR H5679-5688)</text>
                    <name>House floor actions</name>
                <text>Mr. Cline moved to suspend the rules and pass the bill.</text>
                    <name>House floor actions</name>
                <text>Placed on the House Calendar, Calendar No. 43.</text>
                    <name>House floor actions</name>
                <text>Reported by the Committee on Judiciary. H. Rept. 118-235.</text>
                    <name>House floor actions</name>
                        <name>Judiciary Committee</name>
                <text>Reported by the Committee on Judiciary. H. Rept. 118-235.</text>
                    <name>Library of Congress</name>
                        <name>Judiciary Committee</name>
                        <name>Judiciary Committee</name>
                    <name>House committee actions</name>
                <text>Ordered to be Reported by Voice Vote.</text>
                        <name>Judiciary Committee</name>
                    <name>House committee actions</name>
                <text>Committee Consideration and Mark-up Session Held.</text>
                <text>Referred to the House Committee on the Judiciary.</text>
                    <name>House floor actions</name>
                        <name>Judiciary Committee</name>
                <text>Introduced in House</text>
                    <name>Library of Congress</name>
                <text>Introduced in House</text>
                    <name>Library of Congress</name>
                <fullName>Rep. Hageman, Harriet M. [R-WY-At Large]</fullName>
                <title>H.R. 3496, a bill to make technical amendments to update statutory references to certain provisions which were formerly classified to chapters 14 and 19 of title 25, United States Code, and to correct related technical errors</title>
                <description>As ordered reported by the House Committee on the Judiciary on May 24, 2023</description>
            <name>Native Americans</name>
                    <name>Congressional operations and organization</name>
                    <name>Indian claims</name>
                    <name>Indian lands and resources rights</name>
                <name>Native Americans</name>
                <actionDesc>Introduced in House</actionDesc>
                    <![CDATA[ <p>This bill updates references in the <em>U.S. Code</em> to certain provisions in Title 25 (Indians). In 2016, Congress transferred certain provisions in Chapter 14 (Miscellaneous) and Chapter 19 (Indian Land Claims Settlements) of Title 25 to new chapters at the end of the title as part of an effort to reclassify the code. To reflect the reclassification of the code, this bill updates references to Title 25.</p>]]>
                <actionDesc>Reported to House</actionDesc>
                    <![CDATA[ <p>This bill updates references in the <em>U.S. Code</em> to certain provisions in Title 25 (Indians). In 2016, Congress transferred certain provisions in Chapter 14 (Miscellaneous) and Chapter 19 (Indian Land Claims Settlements) of Title 25 to new chapters at the end of the title as part of an effort to reclassify the code. To reflect the reclassification of the code, this bill updates references to Title 25.</p>]]>
                <actionDesc>Passed House</actionDesc>
                    <![CDATA[ <p>This bill updates references in the <em>U.S. Code</em> to certain provisions in Title 25 (Indians). In 2016, Congress transferred certain provisions in Chapter 14 (Miscellaneous) and Chapter 19 (Indian Land Claims Settlements) of Title 25 to new chapters at the end of the title as part of an effort to reclassify the code. To reflect the reclassification of the code, this bill updates references to Title 25.</p>]]>
        <title>To make technical amendments to update statutory references to certain provisions which were formerly classified to chapters 14 and 19 of title 25, United States Code, and to correct related technical errors.</title>
                <titleType>Display Title</titleType>
                <title>To make technical amendments to update statutory references to certain provisions which were formerly classified to chapters 14 and 19 of title 25, United States Code, and to correct related technical errors.</title>
                <titleType>Official Title as Introduced</titleType>
                <title>To make technical amendments to update statutory references to certain provisions which were formerly classified to chapters 14 and 19 of title 25, United States Code, and to correct related technical errors.</title>
                <billTextVersionName>Introduced in House</billTextVersionName>
                <type>Referred in Senate</type>
                <type>Engrossed in House</type>
                <type>Reported in House</type>
                <type>Introduced in House</type>
            <text>Received in the Senate and Read twice and referred to the Committee on the Judiciary.</text>
    <dublinCore xmlns:dc="">
        <dc:rights>Pursuant to Title 17 Section 105 of the United States Code, this file is not subject to copyright protection and is in the public domain.</dc:rights>
        <dc:contributor>Congressional Research Service, Library of Congress</dc:contributor>
        <dc:description>This file contains bill summaries and statuses for federal legislation. A bill summary describes the most significant provisions of a piece of legislation and details the effects the legislative text may have on current law and federal programs. Bill summaries are authored by the Congressional Research Service (CRS) of the Library of Congress. As stated in Public Law 91-510 (2 USC 166 (d)(6)), one of the duties of CRS is "to prepare summaries and digests of bills and resolutions of a public general nature introduced in the Senate or House of Representatives". For more information, refer to the User Guide that accompanies this file.</dc:description>

Notice how <textVersions> has empty <formats/> tags:

                <type>Referred in Senate</type>
                <type>Engrossed in House</type>
                <type>Reported in House</type>
                <type>Introduced in House</type>

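A quick way to list the affected text versions in a downloaded bill-status file is sketched below. This is only an illustration: the `<item>` wrappers and the nesting in `SAMPLE` are assumptions about the file layout, and `versions_missing_formats` is a hypothetical helper name, not part of any official tooling.

```python
import xml.etree.ElementTree as ET

# Trimmed, hypothetical BILLSTATUS fragment. The <item> wrappers and nesting are
# assumptions; only <textVersions>, <type>, and the empty <formats/> come from
# the report above.
SAMPLE = """
<billStatus>
  <bill>
    <textVersions>
      <item><type>Referred in Senate</type><formats/></item>
      <item><type>Engrossed in House</type><formats/></item>
      <item><type>Reported in House</type><formats/></item>
      <item><type>Introduced in House</type><formats/></item>
    </textVersions>
  </bill>
</billStatus>
"""


def versions_missing_formats(xml_text):
    """Return the <type> of each text version whose <formats> has no children."""
    root = ET.fromstring(xml_text.strip())
    missing = []
    for text_versions in root.iter("textVersions"):
        # Only direct <item> children are text versions; nested items are skipped.
        for item in text_versions.findall("item"):
            formats = item.find("formats")
            # Absent and empty <formats> are both treated as "missing".
            if formats is None or len(formats) == 0:
                missing.append(item.findtext("type"))
    return missing


print(versions_missing_formats(SAMPLE))
```

Running the same function over every bill-status file for a congress would be one way to build the list of affected legislation.
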
However, if you check the API, you'll see the formats are available:


curl --location '<API_KEY>' 


    "pagination": {
        "count": 4
    "request": {
        "billNumber": "3496",
        "billType": "hr",
        "billUrl": "",
        "congress": "118",
        "contentType": "application/json",
        "format": "json"
    "textVersions": [
            "date": "2023-11-14T05:00:00Z",
            "formats": [
                    "type": "Formatted Text",
                    "url": ""
                    "type": "PDF",
                    "url": ""
            "type": "Referred in Senate"
            "date": "2023-11-13T05:00:00Z",
            "formats": [
                    "type": "Formatted Text",
                    "url": ""
                    "type": "PDF",
                    "url": ""
            "type": "Engrossed in House"
            "date": "2023-09-29T04:00:00Z",
            "formats": [
                    "type": "Formatted Text",
                    "url": ""
                    "type": "PDF",
                    "url": ""
            "type": "Reported in House"
            "date": "2023-05-18T04:00:00Z",
            "formats": [
                    "type": "Formatted Text",
                    "url": ""
                    "type": "PDF",
                    "url": ""
            "type": "Introduced in House"

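For comparison with the bulk data, the format links can be pulled out of a response shaped like the one above. This is a minimal sketch: the URLs are blank in this thread, so the `example.invalid` values are placeholders, and `formats_by_version` is a hypothetical helper name.

```python
import json

# Shaped like the API output above, cut to two versions. The original URLs are
# blank in this thread, so the example.invalid values are placeholders only.
SAMPLE = json.loads("""
{
  "textVersions": [
    {
      "date": "2023-11-14T05:00:00Z",
      "formats": [
        {"type": "Formatted Text", "url": "https://example.invalid/rfs.htm"},
        {"type": "PDF", "url": "https://example.invalid/rfs.pdf"}
      ],
      "type": "Referred in Senate"
    },
    {
      "date": "2023-05-18T04:00:00Z",
      "formats": [
        {"type": "Formatted Text", "url": "https://example.invalid/ih.htm"},
        {"type": "PDF", "url": "https://example.invalid/ih.pdf"}
      ],
      "type": "Introduced in House"
    }
  ]
}
""")


def formats_by_version(payload):
    """Map each text version type to a {format type: url} dictionary."""
    return {
        version["type"]: {f["type"]: f["url"] for f in version.get("formats", [])}
        for version in payload.get("textVersions", [])
    }


for version_type, formats in formats_by_version(SAMPLE).items():
    print(version_type, sorted(formats))
```
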
And if you guess the GPO URL, you'll see that the format files are available:

curl --location ''

I'll provide a list of the affected legislation here:

118th congress

jonquandt commented 4 months ago

Good afternoon,

The BILLSTATUS XML will only return textVersion format information if the XML is included. In the example you cited above, there is no XML available. We do not currently provide links to the PDF or HTML within the BILLSTATUS XML on the bulkdata repository.

That being said, you may be interested in using our related service to find the related bills and download any relevant information. For example, a request to the related service for BILLSTATUS-118hr3496 returns a list of relationships: in this case, Congressional Bills (BILLS), History of Bills (HOB), and Congressional Reports (CRPT).

    "relationships": [
            "relationshipLink": "",
            "collection": "BILLS",
            "relationship": "Bill versions"
            "relationshipLink": "",
            "collection": "HOB",
            "relationship": "Bill History"
            "relationshipLink": "",
            "collection": "CRPT",
            "relationship": "Congressional Reports"
    "relatedId": "BILLSTATUS-118hr3496"

If you follow the BILLS relationship, you will see a list of different bill versions related to this BILLSTATUS:

    "results": [
            "dateIssued": "2023-05-18",
            "billVersion": "ih",
            "packageId": "BILLS-118hr3496ih",
            "packageLink": "",
            "billVersionLabel": "Introduced in House",
            "lastModified": "2023-08-07T13:09:13Z"
            "dateIssued": "2023-09-29",
            "billVersion": "rh",
            "packageId": "BILLS-118hr3496rh",
            "packageLink": "",
            "billVersionLabel": "Reported in House",
            "lastModified": "2023-11-14T05:54:43Z"
            "dateIssued": "2023-11-13",
            "billVersion": "eh",
            "packageId": "BILLS-118hr3496eh",
            "packageLink": "",
            "billVersionLabel": "Engrossed in House",
            "lastModified": "2023-11-15T04:33:15Z"
            "dateIssued": "2023-11-14",
            "billVersion": "rfs",
            "packageId": "BILLS-118hr3496rfs",
            "packageLink": "",
            "billVersionLabel": "Referred in Senate",
            "lastModified": "2023-11-16T01:54:49Z"
    "relatedId": "BILLSTATUS-118hr3496"

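To connect this list back to the empty `<formats/>` entries, the results can be indexed by their version label. The sketch below assumes that `billVersionLabel` matches the `<type>` text in the bill-status file exactly, which holds for the four versions in this thread but is not guaranteed in general. `package_ids_by_label` is a hypothetical helper name.

```python
# Values copied from the "results" list above. packageLink is left out because
# the links are blank in this thread.
RESULTS = [
    {"billVersion": "ih", "packageId": "BILLS-118hr3496ih",
     "billVersionLabel": "Introduced in House"},
    {"billVersion": "rh", "packageId": "BILLS-118hr3496rh",
     "billVersionLabel": "Reported in House"},
    {"billVersion": "eh", "packageId": "BILLS-118hr3496eh",
     "billVersionLabel": "Engrossed in House"},
    {"billVersion": "rfs", "packageId": "BILLS-118hr3496rfs",
     "billVersionLabel": "Referred in Senate"},
]

# The <type> values whose <formats/> were empty in the bill-status file.
MISSING_TYPES = [
    "Referred in Senate",
    "Engrossed in House",
    "Reported in House",
    "Introduced in House",
]


def package_ids_by_label(results):
    """Index the related-service results by their human-readable version label."""
    return {entry["billVersionLabel"]: entry["packageId"] for entry in results}


lookup = package_ids_by_label(RESULTS)
for text_version_type in MISSING_TYPES:
    # Assumption: labels and <type> strings match exactly; fall back if not.
    print(text_version_type, "->", lookup.get(text_version_type, "no matching package"))
```
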
Following any of these package links returns a summary of that content, including download links. For example, BILLS-118hr3496rfs:

    "originChamber": "HOUSE",
    "references": [**COLLAPSED FOR READABILITY**],
    "congress": "118",
    "session": "1",
    "detailsLink": "",
    "isPrivate": "false",
    "title": "An act h.R. 3496 (RFS) - Referred in Senate",
    "branch": "legislative",
    "isAppropriation": "false",
    "collectionName": "Congressional Bills",
    "download": {
        "premisLink": "",
        "txtLink": "",
        "zipLink": "",
        "modsLink": "",
        "pdfLink": ""
    "pages": "37",
    "related": {"billStatusLink": ""},
    "relatedLink": "",
    "suDocClassNumber": "Y 1.6:, Y 1.4/6:",
    "dateIssued": "2023-11-14",
    "currentChamber": "SENATE",
    "billVersion": "rfs",
    "billType": "hr",
    "packageId": "BILLS-118hr3496rfs",
    "committees": [{
        "authorityId": "ssju00",
        "chamber": "S",
        "committeeName": "Committee on the Judiciary",
        "type": "S"
    "collectionCode": "BILLS",
    "governmentAuthor2": "Senate",
    "governmentAuthor1": "Congress",
    "publisher": "U.S. Government Publishing Office",
    "docClass": "hr",
    "lastModified": "2023-11-16T01:54:49Z",
    "category": "Bills and Statutes",
    "billNumber": "3496",
    "otherIdentifier": {
        "migrated-doc-id": "f:h3496rfs.txt",
        "parent-ils-system-id": "000501532",
        "child-ils-title": "House bills",
        "parent-ils-title": "Congressional bills",
        "child-ils-system-id": "000325573",
        "stock-number": "021-610-00252-9"

ryparker commented 3 months ago

Thanks for the workaround, Jon.
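
For anyone else landing here, the last step of the workaround (reading the non-XML links off a package summary) might look roughly like this. It is only a sketch: the download URLs are blank in this thread, so the `example.invalid` values are placeholders, and `non_xml_links` is a hypothetical helper name.

```python
# Shaped like the package summary above, reduced to the relevant keys. The link
# values are placeholders because the real ones are blank in this thread.
SUMMARY = {
    "packageId": "BILLS-118hr3496rfs",
    "billVersion": "rfs",
    "download": {
        "premisLink": "https://example.invalid/premis",
        "txtLink": "https://example.invalid/htm",
        "zipLink": "https://example.invalid/zip",
        "modsLink": "https://example.invalid/mods",
        "pdfLink": "https://example.invalid/pdf",
    },
}


def non_xml_links(summary, wanted=("pdfLink", "txtLink")):
    """Return the requested download links that are present and non-empty."""
    download = summary.get("download", {})
    return {key: download[key] for key in wanted if download.get(key)}


print(SUMMARY["packageId"], non_xml_links(SUMMARY))
```
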

Does it make sense to keep this open as a feature request? It would be nice to have the non-XML format URLs included in the BILLSTATUS response.