eebbesen / hbcb_rails

Cool data at https://www.gov.mb.ca/chc/archives/hbca/biographical/index.html, but it is stuck in PDFs
MIT License
0 stars 1 fork

Consider pre-processing files for common issues that I don't want to solve with regex #9

Open eebbesen opened 8 years ago

eebbesen commented 8 years ago

Solve common formatting issues from the PDF conversion in a pre-processing step. This may make the regex less obtuse.
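
A minimal sketch of what such a pre-processing pass could look like, run before any parsing regex sees the text. The issue does not say which formatting problems are meant, so the fixes below (line endings, ligatures, curly quotes, words hyphenated across line breaks, runs of spaces) are assumptions about typical PDF-to-text output, and `PdfTextPreprocessor` is a hypothetical name, not something in this repository.

```ruby
# Hypothetical pre-processing pass for text extracted from PDFs.
# The list of fixes is an assumption about common conversion debris,
# not a description of what this project actually does.
module PdfTextPreprocessor
  # Typographic ligatures that PDF extraction often leaves as single code points.
  LIGATURES = {
    "\uFB00" => "ff",
    "\uFB01" => "fi",
    "\uFB02" => "fl"
  }.freeze

  module_function

  def clean(text)
    text = text.gsub(/\r\n?/, "\n")                # normalize line endings to \n
    text = text.gsub(/[\uFB00-\uFB02]/, LIGATURES) # expand ligatures to plain letters
    text = text.gsub(/[\u2018\u2019]/, "'")        # straighten single quotes
    text = text.gsub(/[\u201C\u201D]/, '"')        # straighten double quotes
    # Rejoin words split by a hyphen at a line break. This also joins
    # genuinely hyphenated names that happen to wrap, so treat it as a heuristic.
    text = text.gsub(/(\w)-\n(\w)/, '\1\2')
    text = text.gsub(/[ \t\u00A0]+/, " ")          # collapse runs of spaces, tabs, NBSPs
    text.lines.map(&:rstrip).join("\n")            # drop trailing whitespace per line
  end
end
```

With the input normalized this way, the downstream patterns can assume single spaces, plain quotes, and unbroken words instead of carrying alternations for each variant.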