sunlightlabs / regulations-scraper

Scraper of public comments on regulations.gov
BSD 3-Clause "New" or "Revised" License

need to scrape comments from regulations.gov using API #1

Open dtgossi opened 5 years ago

dtgossi commented 5 years ago

Hi there.

New to GitHub. I need to figure out how to write code to scrape comments from regulations.gov with the new API, across multiple pages of results (though I think that is implied). I'm an experienced researcher but new to coding, so I need help.

This, specifically, is the link to the comments I'm working with.

D

dtgossi commented 5 years ago

I just emailed regulations@erulemakinghelpdesk.com in order to get the API key. We'll see what they say.

D
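A minimal sketch of the multi-page fetch asked about above, not a tested client. The endpoint URL, the `filter[docketId]`, `page[number]`, `page[size]` and `api_key` parameter names, and the `data` key in the response are assumptions based on the v4 regulations.gov API and should be checked against the current documentation. The docket ID and key in the usage note are placeholders, and `build_url`, `fetch_page` and `collect_comments` are hypothetical helper names.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint; confirm against the current regulations.gov API docs.
BASE_URL = "https://api.regulations.gov/v4/comments"


def build_url(docket_id, page_number, api_key, page_size=25):
    """Build the URL for one page of comments (parameter names are assumptions)."""
    params = {
        "filter[docketId]": docket_id,
        "page[number]": page_number,
        "page[size]": page_size,
        "api_key": api_key,
    }
    return BASE_URL + "?" + urllib.parse.urlencode(params)


def fetch_page(url):
    """Download one page and decode the JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def collect_comments(docket_id, api_key, fetch=fetch_page, max_pages=20):
    """Request pages 1, 2, 3, ... until a page has no records or max_pages is hit."""
    comments = []
    for page_number in range(1, max_pages + 1):
        payload = fetch(build_url(docket_id, page_number, api_key))
        records = payload.get("data", [])  # assumed response shape
        if not records:
            break
        comments.extend(records)
    return comments
```

Usage would look something like `collect_comments("EXAMPLE-2020-0001", "YOUR_API_KEY")`. The `fetch` argument is there so the paging loop can be tried with a stand-in function before any real requests are made, and `max_pages` keeps a first run from exhausting a rate limit.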

apendleton commented 5 years ago

@dtgossi I maintained this project when I worked for Sunlight, but I haven't worked there in several years and nobody else took it over, so unfortunately I'm not sure anyone paying attention to this repo is going to be able to be of much assistance. I no longer have an API key, and the database that we used this scraper to populate is long gone at this point. Good luck though!

willjobs commented 3 years ago

Hey @apendleton, you mentioned that the database you used the scraper to populate is long gone. Did it get deleted? Or is it just a matter of finding the right person to contact? If the latter, is there any chance you could point me in the direction of the right person to contact? I'm conducting some research and arduously pulling the data, and being able to stand on the shoulders of past giants would be super helpful. Thanks either way!

apendleton commented 3 years ago

@willjobs totally gone, unfortunately. The organization that employed me to work on it, and paid for all that infrastructure, is defunct, and other than someone keeping the WordPress install up, the whole technical footprint is gone with it. If I had had the foresight, I would have walked out the door with everything on an external hard drive, but I didn't. That was all ~6 years ago now anyway, during which time regulations.gov has been rewritten, the number of documents there has more than doubled, etc.

Happy to chat, though, if you have questions about approaches or whatnot. Feel free to shoot me an email.