CIP Archive - Previous Aldermanic Menu Program Books by Year section
Yes, someone has already asked the city if this data is available in a CSV file. It is not.
Write Python functions to convert the text in the PDFs to CSV format. You can do OCR with pytesseract, but you might be able to get the text directly out of the PDF files using PyMuPDF or some other library. The PDFs for different years have different formats.
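As a starting point for the direct-extraction route, here is a minimal sketch, assuming PyMuPDF is installed and the PDFs carry an embedded text layer. The function names and the `min_chars` threshold are hypothetical, chosen only for illustration. Pages that come back nearly empty are likely scanned images and would be candidates for OCR with pytesseract instead.

```python
def extract_page_texts(pdf_path):
    """Return a list holding the embedded text of each page (no OCR)."""
    # Imported lazily so the helper below works without PyMuPDF installed.
    import fitz  # PyMuPDF

    with fitz.open(pdf_path) as doc:
        return [page.get_text() for page in doc]


def pages_needing_ocr(page_texts, min_chars=20):
    """Return indexes of pages whose embedded text is too short to be useful.

    min_chars is an arbitrary cutoff (an assumption); tune it per PDF format.
    """
    return [
        index
        for index, text in enumerate(page_texts)
        if len(text.strip()) < min_chars
    ]
```

Checking `pages_needing_ocr(extract_page_texts(path))` per file should show quickly which years can skip OCR entirely.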
[x] Double check that the raw data is not available on the Open Data Portal or anywhere else
[x] Create functions to get raw text from PDFs
[ ] Create functions to convert the raw text into structured data for the different PDF formats
[x] 2019+
[ ] 2017-2018
[ ] 2012-2016
[x] #1
[x] Convert project location descriptions into GeoJSON objects (Ex: "ON N MAPLEWOOD AVE FROM W BELDEN AV (2300 N) TO W MEDILL AV (2334 N)" should be a line between those two spots)
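For the location-to-GeoJSON item, here is a rough sketch of one possible approach, not necessarily how it was implemented. It assumes descriptions follow the "ON street FROM cross street TO cross street" pattern from the example above; other patterns would need their own rules. The `geocode` argument is a hypothetical callable that takes a street and a cross street and returns a `(longitude, latitude)` pair; no particular geocoding service is implied.

```python
import re

# Matches "ON <street> FROM <cross street> TO <cross street>".
LOCATION_RE = re.compile(
    r"^ON\s+(?P<street>.+?)\s+FROM\s+(?P<start>.+?)\s+TO\s+(?P<end>.+)$"
)
# Matches a trailing block number such as "(2300 N)".
BLOCK_RE = re.compile(r"\s*\(\d+\s+[NSEW]\)\s*$")


def parse_location(description):
    """Split a description into street, start and end cross streets."""
    match = LOCATION_RE.match(description.strip())
    if match is None:
        return None  # not in the FROM/TO format
    return {
        "street": match.group("street"),
        "start": BLOCK_RE.sub("", match.group("start")),
        "end": BLOCK_RE.sub("", match.group("end")),
    }


def to_line_feature(description, geocode):
    """Build a GeoJSON LineString Feature between the two intersections.

    geocode is a hypothetical callable: (street, cross_street) -> (lon, lat).
    """
    parts = parse_location(description)
    if parts is None:
        return None
    start = geocode(parts["street"], parts["start"])
    end = geocode(parts["street"], parts["end"])
    return {
        "type": "Feature",
        "geometry": {
            "type": "LineString",
            # GeoJSON positions are ordered [longitude, latitude].
            "coordinates": [list(start), list(end)],
        },
        "properties": {"description": description},
    }
```

With the example description, `parse_location` yields the street `N MAPLEWOOD AVE` and the cross streets `W BELDEN AV` and `W MEDILL AV`. Descriptions that do not match the pattern return `None`, so they can be collected and handled separately.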