jnunemaker / httparty

Preventing large payloads? #795

Status: Closed (fauno closed this 9 months ago)

fauno commented 9 months ago

Hi! I wanted to implement download limits for requests because I'm working with untrusted sources.

I tried this:

content_length = 0

HTTParty.get(url) do |fragment|
  content_length += fragment.bytesize

  raise "limit reached" if content_length > 1_000_000
end

But it doesn't work for gzip- or brotli-compressed data, because both Net::HTTP (Ruby 3.1) and HTTParty (0.21.0) decompress only after downloading the whole body into memory. Is there a way to enforce a limit like this that also covers compressed data?

Edit: mention compression algorithms

jnunemaker commented 9 months ago

Would setting the stream_body: false option help?

https://github.com/jnunemaker/httparty/blob/master/lib/httparty/request.rb#L156-L173

I think that would avoid building up the entire body into a string in the response. Closing to keep things tidy, but happy to keep trying to help.

fauno commented 9 months ago

Thanks! I thought stream_body defaulted to false, so I tried stream_body: true, and it works for gzip because Net::HTTP streams the decompressed body.

content_length = 0

HTTParty.get(url, stream_body: true) do |fragment|
  content_length += fragment.bytesize

  raise "limit reached" if content_length > 1_000_000
end

Then I added brotli decompression, but 2GB of zeroes compresses to 1.6KB with brotli (versus 1.9MB with gzip), so the entire payload fits into a single call to Brotli.inflate, which defeats the purpose. I should probably try another gem, but ignoring content-encoding: br will do for now.
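To make the compression ratio concrete, here is a minimal sketch using only Ruby's stdlib Zlib. It is an illustration rather than part of the original test: a 10 MiB buffer of zero bytes stands in for the 2GB payload, and it covers gzip only, since brotli needs an extra gem.

```ruby
require 'zlib'

# Assumption: 10 MiB of zero bytes as a small stand-in for the 2GB payload.
original = "\0" * (10 * 1024 * 1024)
compressed = Zlib.gzip(original)

# Integer division is enough to show the order of magnitude.
ratio = original.bytesize / compressed.bytesize

puts "original:   #{original.bytesize} bytes"
puts "compressed: #{compressed.bytesize} bytes"
puts "ratio:      roughly #{ratio}:1"
```

Counting bytes on the compressed side says very little about how much memory the decompressed body will need, which is why the limit has to be checked on the decompressed output.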

Here's the final try:

require 'httparty'
require 'brotli'

body = ''.dup
HTTParty.get(url, stream_body: true, headers: { 'accept-encoding': 'br' }) do |fragment|
  case fragment.http_response['content-encoding']
  when 'br'
    body << Brotli.inflate(fragment) # inflates 1.6KB fragment into a 2GB body
  else
    body << fragment
  end

  raise 'limit reached' if body.bytesize > 1_000_000
end

fauno commented 9 months ago

Got it!

require 'httparty'
require 'brs'
require 'stringio'

body = ''.dup
HTTParty.get(url, stream_body: true, headers: { 'accept-encoding': 'br,gzip' }) do |fragment|
  case fragment.http_response['content-encoding']
  when 'br'
    BRS::Stream::Reader.new(StringIO.new(fragment), source_buffer_length: 256, destination_buffer_length: 1024).each_char do |char|
      body << char

      raise "limit" if body.bytesize > 1_000_000
    end
  else
    body << fragment
  end

  raise "limit" if body.bytesize > 1_000_000
end
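
The same idea, decompressing in small steps and checking the size after each step, can be sketched with Ruby's stdlib Zlib for gzip data. This is an illustration under assumptions and not what HTTParty or Net::HTTP do internally: `inflate_with_limit` is a hypothetical helper, and the fragments are built by hand instead of coming from a real request.

```ruby
require 'zlib'

# Hypothetical helper (not part of HTTParty): inflates gzip fragments one at a
# time and raises as soon as the decompressed size passes the limit.
def inflate_with_limit(fragments, limit:)
  inflater = Zlib::Inflate.new(Zlib::MAX_WBITS + 16) # +16 selects gzip framing
  body = ''.b

  fragments.each do |fragment|
    # When given a block, inflate yields the output piece by piece instead of
    # returning it as one string, so the check runs between pieces.
    inflater.inflate(fragment) do |piece|
      body << piece
      raise 'limit reached' if body.bytesize > limit
    end
  end

  body
end

# Usage: 5MB of zeroes, gzipped and split into 256-byte fragments by hand.
payload = Zlib.gzip("\0" * 5_000_000)
fragments = payload.bytes.each_slice(256).map { |bytes| bytes.pack('C*') }

begin
  inflate_with_limit(fragments, limit: 1_000_000)
rescue RuntimeError => e
  puts "stopped early: #{e.message}"
end
```

Unlike the BRS snippet above, this sketch keeps a single inflater across all fragments, so a compressed stream that spans several fragments is still decoded as one stream.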