usgpo / bulk-data

User Guides for XML on the govinfo Bulk Data Repository. For information about Bill Status XML Bulk Data, see https://github.com/usgpo/bill-status.
https://www.govinfo.gov/bulkdata

API error calling REST .. can't whitelist since I am using lambda.. Can we get an API key? #148

Closed rchancey closed 4 months ago

rchancey commented 5 months ago

@jonquandt . or whoever is monitoring this.

I am calling the API via HTTP REST using lambda.. and getting the following:

I am unable to whitelist an IP address since I am using AWS Cloud/Lambda.

I can't find a way to get an API key for this site..

https://www.ecfr.gov/developers/documentation/api/v1#/Versioner%20Service/get_api_versioner_v1_full__date__title__title__xml

calling https://www.ecfr.gov/admin/api/admin/v1/corrections.json?

Is there a way I can get an API key so I can call this please?

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">

Federal Register :: Request Access

Request Access

Due to aggressive automated scraping of FederalRegister.gov and eCFR.gov, programmatic access to these sites is limited to access to our extensive developer APIs.

If you are human user receiving this message, we can add your IP address to a set of IPs that can access FederalRegister.gov & eCFR.gov; complete the CAPTCHA (bot test) below and click "Request Access". This process will be necessary for each IP address you wish to access the site from, requests are valid for approximately one quarter (three months) after which the process may need to be repeated.

An official website of the United States government.

If you want to request a wider IP range, first request access for your current IP, and then use the "Site Feedback" button found in the lower left-hand side to make the request.

jonquandt commented 5 months ago

Good afternoon,

I will pass this along to the team that supports eCFR.gov. That being said, there is no API key needed to access the ecfr.gov API. I see what you're saying about not being able to provide a list of IPs or IP ranges to support this as it is currently set up.

rchancey commented 5 months ago

Thank you @jonquandt! That would be very helpful. Serverless cloud computing is certainly a thing, and it would be nice to be able to call the APIs from it. I appreciate your following up with them. Is there a way I can send them a request directly? For example, to ask how to get images and such?

jonquandt commented 5 months ago

I would suggest clicking on the Site Feedback button at the bottom right of https://www.ecfr.gov/reader-aids/ecfr-developer-resources/rest-api-interactive-documentation

and provide them with some additional information -- may help to include links to this and your high-quality image issue directly.

Be sure to check the "I am requesting technical help or providing website feedback" option.

rchancey commented 5 months ago

Ok. I did that and let's see what they say.

What is this git repository for? I also saw you had created a test repo for calling some API and I tried that and it's also down.

jonquandt commented 5 months ago

> Ok. I did that and let's see what they say.
>
> What is this git repository for? I also saw you had created a test repo for calling some API and I tried that and it's also down.

This is for the GovInfo Bulk Data repository. While GovInfo bulkdata eCFR content makes its way to ecfr.gov, this repository isn't really the right place to request support for the ecfr.gov API.

rchancey commented 5 months ago

Errr.. what's the difference? Doesn't the eCFR API just offer up the bulk data? I am downloading the full XML titles from the link here: https://www.govinfo.gov/bulkdata/ECFR It's awesome!! (other than images). Is that what this repo is for?

That bulk data link has been a lifesaver. The only thing missing is changes. That's why the API is important: I can't figure out any other way to get changes.

jonquandt commented 5 months ago

The bulkdata site is operated by the GovInfo team and the ecfr.gov API is operated by a different team. The ecfr.gov site takes the ECFR xml from GovInfo (technically via our GovInfo API) and then takes some additional actions to enhance display and functionality, including the change tracking/diffing that you mention.

GovInfo is solely a GPO system with content from all three branches of the Federal Government, while the ecfr.gov site and API are separately maintained by NARA's Office of the Federal Register in partnership with GPO. They have different areas of focus, which is why the ecfr.gov API has specific functionality tailored to the needs of ECFR users.

rchancey commented 5 months ago

So I am currently using the xml from here: https://www.govinfo.gov/bulkdata/ECFR/title-50 I just open a url and download the xml. Works great. Do you have an API that provides this that follows the same schema as the xml?

What about changes and images.. how do you get those? :) Your site is great btw.. it's been a lifesaver. Well done on that.

jonquandt commented 5 months ago

for the content itself, you could use the following GovInfo API calls:

https://api.govinfo.gov/collections/ECFR/2024-02-01T00:00:00Z?offsetMark=*&pageSize=50&api_key=DEMO_KEY to get a list of all ECFR titles that have been updated since the time specified (in this case since the beginning of February).

Then follow individual packageLinks to get additional information as well as the specific urls for download. For ECFR, the xmlLink urls are pointed directly to the Bulkdata site xml.

Example package summary (which also includes a link to the relevant, lower-quality graphics): https://api.govinfo.gov/packages/ECFR-title40/summary?api_key=DEMO_KEY

GovInfo doesn't track the changes to individual ECFR titles -- the ECFR.gov system provides that point in time functionality.

I believe that the ecfr.gov API may provide programmatic access to higher-quality images, but the ecfr.gov team would be better equipped to help you with that.

Probably better to get all of the information from ecfr.gov - they have endpoints that allow downloads of full xml as well, at the title or lower levels.
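
The two-step GovInfo flow described above (list the packages updated since a timestamp, then follow each packageLink) could be sketched roughly as follows, using only the Python standard library. The helper names are hypothetical, and the `packages` array and `download` / `xmlLink` fields are assumptions about the response shape based on the descriptions in this thread, so check them against a real response. Only the first page of results is read.

```python
import json
from urllib.parse import quote, urlencode
from urllib.request import urlopen

API_ROOT = "https://api.govinfo.gov"


def collections_url(collection, since, api_key="DEMO_KEY", page_size=50):
    """Build the 'updated since' listing URL for a collection such as ECFR."""
    query = urlencode(
        {"offsetMark": "*", "pageSize": page_size, "api_key": api_key},
        safe="*",  # keep the '*' offset mark readable, as in the thread's URL
    )
    return f"{API_ROOT}/collections/{collection}/{quote(since, safe=':')}?{query}"


def package_links(listing):
    """Return (packageId, packageLink) pairs from a listing response.

    Assumes the packages sit under a top-level "packages" key.
    """
    return [(p["packageId"], p["packageLink"]) for p in listing.get("packages", [])]


def with_api_key(link, api_key="DEMO_KEY"):
    """Append api_key to a link, since packageLink values come without one."""
    separator = "&" if "?" in link else "?"
    return f"{link}{separator}{urlencode({'api_key': api_key})}"


def fetch_json(url):
    """Fetch a URL and decode its JSON body (performs a network call)."""
    with urlopen(url) as response:
        return json.load(response)


def updated_xml_links(since, api_key="DEMO_KEY"):
    """Map each updated ECFR packageId to its XML download URL, if present.

    Assumes the package summary exposes the URL as download.xmlLink.
    """
    listing = fetch_json(collections_url("ECFR", since, api_key))
    links = {}
    for package_id, link in package_links(listing):
        summary = fetch_json(with_api_key(link, api_key))
        links[package_id] = summary.get("download", {}).get("xmlLink")
    return links
```

Calling `updated_xml_links("2024-02-01T00:00:00Z")` would issue one listing request plus one summary request per updated title.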

rchancey commented 5 months ago

ok.. I did send a feedback request so let's see if they respond. Thank you @jonquandt , I will let you know what they say so others can be aware as well.

rchancey commented 5 months ago

@jonquandt I am confused on something:

https://www.govinfo.gov/bulkdata/ECFR
https://www.ecfr.gov/

The bulk data site has ECFR in the URL suffix, and that is where I am getting the full XML. The eCFR site has ecfr in the URL and also has a bulk download.

So I assume that while you have ECFR in the URL, you are not the eCFR team, right?

I am also trying to determine where the ecfr team gets their changes from.

jonquandt commented 5 months ago

The eCFR.gov team supports the https://www.ecfr.gov domain.

The GovInfo team supports https://www.govinfo.gov (and the api.govinfo.gov service)

eCFR.gov pulls the latest eCFR xml from the GovInfo API. They store each version of a particular title and then provide point in time/change tracking on their system essentially by diffing each version of a given title.
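
For anyone who wants a rough local equivalent of the change tracking described above, one option is to keep each downloaded copy of a title and diff consecutive copies. The sketch below uses Python's `difflib` for a line-based diff. It is only an illustration of the general idea, not how eCFR.gov actually implements it, and `diff_versions` is a hypothetical helper name.

```python
import difflib


def diff_versions(old_xml, new_xml, old_label="previous", new_label="current"):
    """Return unified-diff lines between two saved versions of a title.

    A plain line-based text diff; it does not understand XML structure,
    so reformatting alone would also show up as a change.
    """
    return list(
        difflib.unified_diff(
            old_xml.splitlines(),
            new_xml.splitlines(),
            fromfile=old_label,
            tofile=new_label,
            lineterm="",
        )
    )


# Tiny made-up fragments standing in for two stored versions of a title.
old = "<P>First paragraph.</P>\n<P>Old wording.</P>"
new = "<P>First paragraph.</P>\n<P>New wording.</P>"
for line in diff_versions(old, new):
    print(line)
```

Identical inputs produce an empty list, which makes it easy to skip titles that were republished without textual changes.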

rchancey commented 5 months ago

@jonquandt sorry to bug you again in this thread.. but do you happen to know of a source for XML for the Corps of Engineers manuals?

We are looking for Safety and Health Requirements EM 385-1-1

we can only find it in pdf.. but as you can imagine that's pretty horrible to parse and work with

Any ideas, or do you know anyone in that organization who might have it in XML?

Ray

jonquandt commented 5 months ago

> @jonquandt sorry to bug you again in this thread.. but do you happen to know of a source for XML for the Corps of Engineers manuals?
>
> We are looking for Safety and Health Requirements EM 385-1-1
>
> we can only find it in pdf.. but as you can imagine that's pretty horrible to parse and work with
>
> Any ideas or friends in that organization that you may know that have that in xml? Ray

Sorry, I do not have any contacts in that org. I would suggest writing to the webmaster on that site, who may be able to direct your inquiry. There is an official app for that manual on both Android and iOS that looks like it has some form of structured data behind it.

rchancey commented 5 months ago

@jonquandt Hi ya man.. hope you are well? How do these changes work? The last change date seems to be in the future:

The issue date would seemingly be the original date, and lastModified would be the last modification, which according to the eCFR site is: "Displaying title 1, up to date as of 2/27/2024. Title 1 was last amended 12/29/2022."

Unless it's actually being modified in the middle of this query?

  "packageId": "ECFR-title40",
  "lastModified": "2024-02-29T00:30:55Z",
  "packageLink": "https://api.govinfo.gov/packages/ECFR-title40/summary",
  "docClass": "ECFR",
  "title": "Protection of Environment",
  "congress": null,
  "dateIssued": "2024-02-27"
},
{
  "packageId": "ECFR-title12",
  "lastModified": "2024-02-29T00:04:59Z",
  "packageLink": "https://api.govinfo.gov/packages/ECFR-title12/summary",
  "docClass": "ECFR",
  "title": "Banks and Banking",
  "congress": null,
  "dateIssued": "2024-02-27"
},
{
  "packageId": "ECFR-title42",
  "lastModified": "2024-02-29T00:04:38Z",
  "packageLink": "https://api.govinfo.gov/packages/ECFR-title42/summary",
  "docClass": "ECFR",
  "title": "Public Health",
  "congress": null,
  "dateIssued": "2024-02-27"
},
{

jonquandt commented 5 months ago

@rchancey - All of the lastModified times are in UTC. So, 2024-02-29T00:04:38Z in UTC is 2024-02-28T19:04:38 EST (~7pm)

ETA: Also, the dateIssued is effectively the "currency" information, showing how up to date the regulations in the eCFR are -- lastModified indicates the last time it was published on GovInfo.
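
The UTC conversion above can be checked with a few lines of Python. This is a minimal sketch that assumes the `lastModified` format shown in the thread and uses a fixed UTC-5 offset for EST, so it deliberately ignores daylight saving time; the function names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Fixed offset for Eastern Standard Time (assumption: no daylight saving handling).
EST = timezone(timedelta(hours=-5), "EST")


def parse_last_modified(value):
    """Parse a lastModified string such as '2024-02-29T00:04:38Z' as UTC."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def to_est(value):
    """Convert a lastModified string to the equivalent EST datetime."""
    return parse_last_modified(value).astimezone(EST)


print(to_est("2024-02-29T00:04:38Z").isoformat())  # 2024-02-28T19:04:38-05:00
```

Comparing parsed `lastModified` values against `datetime.now(timezone.utc)` rather than local time avoids the apparent "in the future" timestamps.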