misja / python-boilerpipe

Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages

urllib2 headers changed #25

Open rshiva opened 10 years ago

rshiva commented 10 years ago

Changed the urllib2 User-Agent header from Mozilla/5.0 to Mozilla, since requests to some websites were failing with a 406 error.

For more details, see this issue: https://github.com/misja/python-boilerpipe/issues/24
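For readers unfamiliar with the change, here is a minimal sketch of setting the User-Agent on a request. It is not the project's actual code: it uses Python 3's `urllib.request` (the successor to `urllib2`), and `build_request` is a hypothetical helper name used only for illustration.

```python
import urllib.request


def build_request(url, user_agent="Mozilla"):
    """Hypothetical helper: build a request with an explicit User-Agent.

    The default mirrors the value proposed in this PR ("Mozilla" rather
    than "Mozilla/5.0"); whether a given site accepts it is not guaranteed.
    """
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


# Building the request does not touch the network.
req = build_request("http://example.com/")
# urllib stores header names capitalized as "User-agent".
print(req.get_header("User-agent"))
```

Passing the request to `urllib.request.urlopen(req)` would then send it with that header.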

tuxdna commented 7 years ago

@rshiva If this PR is still applicable, please resolve the conflicts and update.

Also, please add a test case so this can be detected in the future.
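One possible shape for such a test, sketched under assumptions: it does not call python-boilerpipe itself, it uses Python 3's `urllib.request` instead of `urllib2`, and `PickyHandler` is a hypothetical local server that stands in for a site answering 406 to "Mozilla/5.0". The names `PickyHandler` and `fetch_status` are illustrative only.

```python
import threading
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class PickyHandler(BaseHTTPRequestHandler):
    """Hypothetical stand-in for a site that rejects 'Mozilla/5.0' with 406."""

    def do_GET(self):
        agent = self.headers.get("User-Agent", "")
        status = 406 if agent == "Mozilla/5.0" else 200
        self.send_response(status)
        self.send_header("Content-Length", "0")
        self.end_headers()

    def log_message(self, *args):
        pass  # keep test output quiet


def fetch_status(user_agent):
    """Start the local server, send one GET, and return the HTTP status."""
    server = HTTPServer(("127.0.0.1", 0), PickyHandler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        url = "http://127.0.0.1:%d/" % server.server_port
        req = urllib.request.Request(url, headers={"User-Agent": user_agent})
        # Empty ProxyHandler so environment proxy settings are ignored.
        opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
        try:
            with opener.open(req, timeout=5) as resp:
                return resp.status
        except urllib.error.HTTPError as err:
            return err.code
    finally:
        server.shutdown()
        server.server_close()
```

A test could then assert that `fetch_status("Mozilla")` returns 200 while `fetch_status("Mozilla/5.0")` returns 406, which keeps the check independent of any external website.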