unitedstates / inspectors-general

Collecting reports from Inspectors General across the US federal government.
https://sunlightfoundation.com/blog/2014/11/07/opengov-voices-opening-up-government-reports-through-teamwork-and-open-data/
Creative Commons Zero v1.0 Universal

Switch away from shell=True #84

konklone closed 10 years ago

konklone commented 10 years ago

The call to pdftotext uses shell=True, and it interpolates file paths into the command.

Either ditch shell=True, or at least demonstrate that an OIG's office could not hack a server running this scraper by crafting malicious filenames.
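For illustration, here is a minimal sketch of what dropping shell=True could look like. It is not the scraper's actual code: the helper names `build_pdftotext_args` and `extract_text` are hypothetical, and the `-layout` flag and the use of absolute paths are assumptions made for the example. The idea is that passing an argument list to `subprocess.run` hands each path to pdftotext as a single argument, so shell metacharacters in a filename are never interpreted.

```python
import os
import subprocess


def build_pdftotext_args(pdf_path, text_path):
    """Return the argument list for pdftotext (hypothetical helper)."""
    # Assumption: absolute paths are acceptable here. They never begin
    # with "-", so a crafted filename cannot be mistaken for an option.
    return [
        "pdftotext",
        "-layout",
        os.path.abspath(pdf_path),
        os.path.abspath(text_path),
    ]


def extract_text(pdf_path, text_path):
    """Run pdftotext without a shell; raises CalledProcessError on failure."""
    # No shell=True: the list is passed to the program as-is, one
    # element per argument, with no shell parsing in between.
    subprocess.run(build_pdftotext_args(pdf_path, text_path), check=True)


# A filename that would run a second command if interpolated into a
# shell string stays a single, inert argument in the list form.
hostile = "report; touch pwned.pdf"
args = build_pdftotext_args(hostile, "report.txt")
```

With the list form, the `;` in the hostile filename is just part of the path that pdftotext is asked to open, which at worst produces a "file not found" error.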

konklone commented 10 years ago

I unfortunately had to revert this in https://github.com/unitedstates/inspectors-general/commit/9b64dd3ea5bcfa758a40fb2b0bb231a8a7259b8f, as it was causing errors with relative paths. I should have tested this more thoroughly before merging #87; that was my bad. I suspect this worked fine on Windows, which is why it wasn't caught beforehand.

divergentdave commented 10 years ago

@konklone, could you try running #111 to see if it works on your machine? Also, is data_directory set in your config?

konklone commented 10 years ago

Fixed by #111.