unitedstates / BillMap

Utilities and applications for the FlatGov project by Demand Progress

Scrape data for CBO report and relate to bill #52

aih closed 3 years ago

aih commented 3 years ago

The data is already available in cboCostEstimates in the data.json. We want to get the data for a bill and for related bills. We should add that to our table.

We may scrape from cbo.gov (https://www.cbo.gov/cost-estimates). The CBO data is also on congress.gov. It would be great to relate it to a logical routing pattern, e.g.

116/hr1
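
As a rough sketch of how such a route could be handled, the snippet below splits a string like 116/hr1 into its congress, bill type, and bill number. The names BillRoute and parse_bill_route are hypothetical and are not part of the project; the pattern assumes a lowercase bill type followed directly by the number.

```python
import re
from typing import NamedTuple


class BillRoute(NamedTuple):
    """Hypothetical container for the parts of a route such as '116/hr1'."""
    congress: int
    bill_type: str
    number: int


# Assumed route shape: <congress>/<bill type letters><bill number>
ROUTE_RE = re.compile(r"^(\d+)/([a-z]+)(\d+)$")


def parse_bill_route(route: str) -> BillRoute:
    """Split a route like '116/hr1' into congress, bill type, and number."""
    match = ROUTE_RE.match(route.strip().lower())
    if match is None:
        raise ValueError(f"Unrecognized bill route: {route!r}")
    congress, bill_type, number = match.groups()
    return BillRoute(int(congress), bill_type, int(number))


if __name__ == "__main__":
    print(parse_bill_route("116/hr1"))
```

A parsed route could then be used to look up the matching bill in the scraped data and its CBO entries.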

R Street may have a process to do this.

aih commented 3 years ago

https://www.govinfo.gov/bulkdata

See e.g. https://www.cbo.gov/publication/56686

The 'legislative information' link connects to the congress.gov page. And congress.gov shows the CBO estimate.

aih commented 3 years ago

Information about the CBO estimates is in the metadata that we scrape with the unitedstates/congress scraper. This information is in the data directory of the project, which we do not store here in git.

We describe the directory structure here: https://github.com/aih/FlatGov/blob/main/CELERY.adoc#scraping-bill-text-and-metadata

For each bill, the fdsys_billstatus.xml file has information about the CBO score in the element at billStatus > bill > cboCostEstimates:

<cboCostEstimates>
  <item>
    <pubDate>2019-03-01T15:56:00Z</pubDate>
    <title>H.R. 1, For the People Act of 2019</title>
    <url>https://www.cbo.gov/publication/55003</url>
  </item>
  <item>
    <pubDate>2019-03-07T23:01:00Z</pubDate>
    <title>H.R. 1, For the People Act of 2019</title>
    <url>https://www.cbo.gov/publication/55021</url>
  </item>
</cboCostEstimates>
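
A minimal sketch of pulling these entries out with Python's standard library could look like the following. The function name extract_cbo_estimates and the inline SAMPLE document are illustrative assumptions; the sample wraps the fragment above in the billStatus > bill nesting described earlier, and a real fdsys_billstatus.xml contains many more elements.

```python
import xml.etree.ElementTree as ET

# Illustrative sample only: the fragment above wrapped in billStatus > bill.
SAMPLE = """
<billStatus>
  <bill>
    <cboCostEstimates>
      <item>
        <pubDate>2019-03-01T15:56:00Z</pubDate>
        <title>H.R. 1, For the People Act of 2019</title>
        <url>https://www.cbo.gov/publication/55003</url>
      </item>
      <item>
        <pubDate>2019-03-07T23:01:00Z</pubDate>
        <title>H.R. 1, For the People Act of 2019</title>
        <url>https://www.cbo.gov/publication/55021</url>
      </item>
    </cboCostEstimates>
  </bill>
</billStatus>
"""


def extract_cbo_estimates(xml_text: str) -> list[dict[str, str]]:
    """Return one dict per cboCostEstimates item (hypothetical helper)."""
    root = ET.fromstring(xml_text)  # root is the billStatus element
    estimates = []
    for item in root.findall("./bill/cboCostEstimates/item"):
        estimates.append({
            "pubDate": item.findtext("pubDate", default="").strip(),
            "title": item.findtext("title", default="").strip(),
            "url": item.findtext("url", default="").strip(),
        })
    return estimates


if __name__ == "__main__":
    for estimate in extract_cbo_estimates(SAMPLE):
        print(estimate["pubDate"], estimate["url"])
```

To read from disk instead, ET.parse(path).getroot() returns the same root element. A bill with no cboCostEstimates element simply yields an empty list.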
