Closed by konklone 10 years ago
I unfortunately had to revert this in https://github.com/unitedstates/inspectors-general/commit/9b64dd3ea5bcfa758a40fb2b0bb231a8a7259b8f, as it was causing errors with relative paths. I should have tested this more thoroughly before merging #87; that was my bad. I suspect this worked fine on Windows, which is why it wasn't caught beforehand.
@konklone, could you try running #111 to see if it works on your machine? Also, is data_directory set in your config?
Fixed by #111.
The call to `pdftotext` uses `shell=True`, and it interpolates file paths into the command. Either ditch `shell=True`, or at least demonstrate that an OIG's office could not hack a server running this scraper by crafting malicious filenames.
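For reference, here is a rough sketch of what dropping the shell could look like. It assumes the scraper calls `pdftotext` with one input PDF path and one output text path; the function names `pdftotext_argv` and `extract_text` are made up for illustration and are not taken from the repository.

```python
import os
import subprocess


def pdftotext_argv(pdf_path, text_path):
    """Build the argument list for pdftotext (hypothetical helper).

    Each path is passed as its own argv entry, so characters such as
    ';', '$', or backticks in a filename are never seen by a shell.
    """
    # Absolute paths never start with "-", so a crafted filename
    # cannot be mistaken for a command-line option either.
    return ["pdftotext", os.path.abspath(pdf_path), os.path.abspath(text_path)]


def extract_text(pdf_path, text_path):
    """Run pdftotext without a shell (hypothetical helper)."""
    # Passing a list and leaving shell at its default (False) means the
    # program is executed directly; nothing is interpolated into a
    # command string.
    subprocess.check_call(pdftotext_argv(pdf_path, text_path))
```

With this shape, a filename like `report; rm -rf ~.pdf` reaches `pdftotext` as a single literal argument instead of being split into two shell commands. If `shell=True` has to stay for some reason, quoting each path with `shlex.quote` would be the fallback, but the list form avoids the question entirely.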