Cool data at https://www.gov.mb.ca/chc/archives/hbca/biographical/index.html, but it is stuck in PDFs
Consider pre-processing files for common issues that I don't want to solve with regex #9
Open
eebbesen opened 8 years ago
Solve common formatting issues from the PDF conversion. This may make the regex less obtuse.
Ne AME instead of NAME
*An Outfit year ran from 1 June to 31 May and other common text I don't care about
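
A rough sketch of what such a pre-processing pass could look like, in Python. This is only an illustration, not code from the repo: the function name `preprocess`, the `BOILERPLATE_PATTERNS` list, and the `HEADING_FIXES` map are all hypothetical, and the only entries in them are the two examples mentioned above.

```python
import re

# Hypothetical list of boilerplate lines to drop before the main regex runs.
# Only the "Outfit year" footnote comes from this issue; add others as found.
BOILERPLATE_PATTERNS = [
    re.compile(r"^\*?\s*An Outfit year ran from 1 June to 31 May\.?\s*$", re.IGNORECASE),
]

# Hypothetical map of garbled headings (as produced by the PDF conversion)
# to the heading they were meant to be.
HEADING_FIXES = {
    "Ne AME": "NAME",
}


def preprocess(text: str) -> str:
    """Drop known boilerplate lines and repair known garbled headings."""
    cleaned = []
    for line in text.splitlines():
        # Skip lines that are entirely boilerplate.
        if any(p.match(line.strip()) for p in BOILERPLATE_PATTERNS):
            continue
        # Replace each known garbled heading with its intended form.
        for broken, fixed in HEADING_FIXES.items():
            line = line.replace(broken, fixed)
        cleaned.append(line)
    return "\n".join(cleaned)
```

Running this on the converted text first would let the parsing regex assume a clean `NAME` heading and no footnote lines, at the cost of maintaining the two lookup tables by hand.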