unitedstates / inspectors-general

Collecting reports from Inspectors General across the US federal government.
https://sunlightfoundation.com/blog/2014/11/07/opengov-voices-opening-up-government-reports-through-teamwork-and-open-data/
Creative Commons Zero v1.0 Universal

Use scrapelib's urlretrieve() on binary files #149

Closed by divergentdave 10 years ago

divergentdave commented 10 years ago

This improves performance dramatically: urlopen() was running large PDF files through character set detection models, work that is wasted on binary data.

(Note that, for testing reasons, I had to branch off of #147 instead of master.)
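To illustrate the idea behind the change, here is a minimal sketch that uses only the Python standard library. It is not the code from this pull request: `save_binary` and `demo` are hypothetical names, and the in-memory chunks stand in for a real HTTP response body. The point is only that a binary file such as a PDF can be written to disk as raw bytes, with no text decoding (and therefore no character set detection) along the way.

```python
import os
import tempfile
from typing import Iterable


def save_binary(chunks: Iterable[bytes], dest_path: str) -> int:
    """Write raw byte chunks to dest_path and return the number of bytes written.

    Hypothetical helper: the file is opened in binary mode, so the data is
    never decoded to text and no charset guessing takes place.
    """
    written = 0
    with open(dest_path, "wb") as f:
        for chunk in chunks:
            f.write(chunk)
            written += len(chunk)
    return written


def demo() -> bool:
    """Round-trip some fake PDF bytes through save_binary and verify them."""
    # Assumption: these bytes stand in for a downloaded report; they include
    # every possible byte value, which would not survive a text decode.
    fake_pdf = b"%PDF-1.4\n" + bytes(range(256)) * 4
    chunks = [fake_pdf[i:i + 64] for i in range(0, len(fake_pdf), 64)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "report.pdf")
        count = save_binary(chunks, path)
        with open(path, "rb") as f:
            return count == len(fake_pdf) and f.read() == fake_pdf


if __name__ == "__main__":
    print(demo())
```

In a real scraper the chunks would come from the HTTP client's streamed response body rather than from a list in memory.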

konklone commented 10 years ago

I cannot believe how much faster you just made downloading, and how blind I was not to see how slow it's been all this time. Thank you so much for identifying and fixing this!

audiodude commented 10 years ago

I just want to point out for posterity that at one point, I asked @konklone why my download was taking epic lengths of time and he somehow convinced me that the IG website was just that slow.

Shame on you, @konklone!

spulec commented 10 years ago

This is amazing!