misja / python-boilerpipe

Python interface to Boilerpipe, Boilerplate Removal and Fulltext Extraction from HTML pages

gzip pages decompressor added #44

Open mahdi-saberi opened 7 years ago

mahdi-saberi commented 7 years ago

Some websites have gzip compression enabled on their web servers, e.g. http://www.yjc.ir/fa/news/6072181/

So we need a gzip decompressor for fetched pages before the HTML is passed on for extraction.
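
To illustrate the request, here is a minimal sketch of what such a decompression step might look like. It is not the library's actual code: the helper name `maybe_gunzip` and its parameters are hypothetical, and it assumes Python 3 and that the caller already has the raw response bytes plus the value of the `Content-Encoding` header (if any).

```python
import gzip
import zlib

# Every gzip stream starts with these two magic bytes.
GZIP_MAGIC = b"\x1f\x8b"


def maybe_gunzip(body, content_encoding=None):
    """Hypothetical helper: return `body` decompressed if it looks
    gzip-encoded, otherwise return it unchanged.

    body             -- raw response bytes
    content_encoding -- value of the Content-Encoding header, or None
    """
    encoding = (content_encoding or "").lower()
    if "gzip" in encoding or body[:2] == GZIP_MAGIC:
        try:
            return gzip.decompress(body)
        except (OSError, EOFError, zlib.error):
            # The payload is not valid gzip after all (for example, the
            # header was wrong), so fall back to the raw bytes.
            return body
    return body
```

The decompressed bytes could then be decoded to text and handed to the extractor as HTML instead of letting it fetch the URL itself. Checking the magic bytes in addition to the header is a design choice for servers that compress without declaring it.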