unitedstates / inspectors-general

Collecting reports from Inspectors General across the US federal government.
https://sunlightfoundation.com/blog/2014/11/07/opengov-voices-opening-up-government-reports-through-teamwork-and-open-data/
Creative Commons Zero v1.0 Universal

Improve HTML text extraction #110

Closed by divergentdave 10 years ago

divergentdave commented 10 years ago

Strip out <script> and <style> elements, strip leading and trailing whitespace from each line, and collapse consecutive newlines. See #91.
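
For illustration, the three steps could be sketched roughly as follows. This is not the code from this change: it uses only the standard-library `html.parser`, the names `_TextExtractor` and `extract_text` are hypothetical, and collapsing runs of newlines down to a single newline is an assumption about the intended behavior.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Hypothetical helper: collects text nodes, skipping <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0  # > 0 while inside a skipped element
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep text only when we are not inside <script> or <style>.
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        return "".join(self._chunks)


def extract_text(html):
    """Return the visible text of an HTML string, one trimmed line per row."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    # Strip leading and trailing whitespace from each line.
    lines = [line.strip() for line in parser.text().splitlines()]
    # Assumption: runs of consecutive newlines collapse to a single newline.
    return re.sub(r"\n{2,}", "\n", "\n".join(lines)).strip()
```

With this sketch, markup such as `<style>p { color: red; }</style><p>  First line  </p>` followed by several blank lines and another paragraph would yield just the paragraph text, one line each. Text from adjacent elements with no newline between them in the source is joined without a separator, which a real implementation would likely want to handle.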