fgregg / legistar-scrape

Legistar Scraper is a Python library for scraping Legistar sites, the legislation management sites hosted by Granicus.
MIT License
23 stars 16 forks

PDF scraping #22

Closed fgregg closed 7 years ago

fgregg commented 11 years ago

Much of Chicago's legislation is boilerplate and amenable to scraping. I'd like to start by scraping zoning applications, e.g. https://chicago.legistar.com/View.ashx?M=F&ID=2617927&GUID=0C5110D0-AF3C-487E-88BB-630311654237

I think pdfminer is the tool we want to use here: https://github.com/mcs07/pdfminer

This should be a separate library.
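
A rough sketch of what such a library might look like, not a tested design. It assumes the `pdfminer.six` distribution, whose `pdfminer.high_level.extract_text` function may not exist in the fork linked above. The names `pdf_to_text`, `parse_fields`, and `FIELD_PATTERNS` are hypothetical, and the two patterns are placeholders: the real wording of the zoning applications would have to be checked against the extracted text first.

```python
import re

# Placeholder patterns (assumption): the actual labels used in Chicago
# zoning applications have not been verified and will likely differ.
FIELD_PATTERNS = {
    "address": re.compile(r"^Address:[ \t]*(.+)$", re.MULTILINE),
    "ward": re.compile(r"^Ward:[ \t]*(\d+)$", re.MULTILINE),
}


def pdf_to_text(path):
    """Return the text layer of the PDF at `path`.

    Assumes pdfminer.six is installed. The import is done here so that
    parse_fields() can be used and tested without pdfminer available.
    """
    from pdfminer.high_level import extract_text

    return extract_text(path)


def parse_fields(text):
    """Return the first match for each pattern, or None where absent."""
    fields = {}
    for name, pattern in FIELD_PATTERNS.items():
        match = pattern.search(text)
        fields[name] = match.group(1).strip() if match else None
    return fields
```

Usage would be something like `parse_fields(pdf_to_text("application.pdf"))` on a downloaded file. Keeping text extraction separate from field parsing means the patterns can be tuned on saved text without re-reading the PDFs. Scanned documents without a text layer would need OCR, which this sketch does not cover.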