Gzip decoded body not used anywhere

taganaka / polipus

Polipus: distributed and scalable web-crawler framework

MIT License


In HTTP#fetch_pages you try to decode the gzipped content of a page:

https://github.com/taganaka/polipus/blob/master/lib/polipus/http.rb#L34-L39

          body = response.body.dup
          if response.to_hash.fetch('content-encoding', [])[0] == 'gzip'
            gzip = Zlib::GzipReader.new(StringIO.new(body))    
            body = gzip.read
          end
          pages << Page.new(location, :body          => response.body.dup,

but the decoded body is not used anywhere: Page.new is still given response.body.dup. :body should get the decoded value instead.
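A minimal sketch of the intended behaviour, assuming the fix is simply to hand the decoded string on to the page. The helper name decoded_body is hypothetical and not part of Polipus; the gzipped payload is built locally to stand in for response.body.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not part of Polipus): returns the body to store
# on the page, gunzipping only when Content-Encoding is 'gzip'.
def decoded_body(raw_body, content_encoding)
  return raw_body.dup unless content_encoding == 'gzip'
  Zlib::GzipReader.new(StringIO.new(raw_body)).read
end

# Stand-in for a gzipped response.body (Zlib.gzip needs Ruby 2.4+).
compressed = Zlib.gzip('<html>hello</html>')

body = decoded_body(compressed, 'gzip')
plain = decoded_body('<html>plain</html>', nil)

# In fetch_pages the decoded value would then be passed on, e.g.:
#   pages << Page.new(location, :body => body, ...)
puts body
puts plain
```

The point is only that the value computed from the gzip branch is the one that ends up in :body; the rest of the Page.new call stays as it is.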

In general, I'm not sure if this is necessary at all, as http://www.ruby-doc.org/stdlib-2.1.1/libdoc/net/http/rdoc/Net/HTTP.html#class-Net::HTTP-label-Compression states that this is done by Net::HTTP automatically.
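As a rough illustration of the automatic handling, the sketch below builds two requests without touching the network. It relies on decode_content, an undocumented reader on Net::HTTP request objects, so treat it as an assumption about Net::HTTP internals that may differ between Ruby versions. Whether Polipus sets its own Accept-Encoding header is not something this example checks.

```ruby
require 'net/http'

# No Accept-Encoding given: Net::HTTP adds its own header and flags the
# request so that the response body is decompressed transparently.
auto = Net::HTTP::Get.new('/')

# Accept-Encoding set by the caller: Net::HTTP leaves decoding to the caller.
manual = Net::HTTP::Get.new('/', { 'Accept-Encoding' => 'gzip' })

puts auto['accept-encoding']   # header added by Net::HTTP itself
puts auto.decode_content       # internal flag, undocumented
puts manual.decode_content
```

If the flag is set for the requests Polipus sends, response.body would already be decoded by the time fetch_pages sees it, and the manual GzipReader step would be redundant.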

Gzip decoded body not used anywhere #20