socketry / async-http

MIT License

What is the recommended retry logic for Async::HTTP::Internet? #161

Closed fangherk closed 3 weeks ago

fangherk commented 3 weeks ago

I'm playing with an example snippet below:

require "net/http"
require "async"
require "async/http/internet"
require "async/barrier"
require "async/semaphore"

RETRIES = 10
N_TIMES = 100_000
SEMAPHORE_LIMIT = 10_000

async_timings =
  Async do |task|
    internet = Async::HTTP::Internet.new
    barrier = Async::Barrier.new
    semaphore = Async::Semaphore.new(SEMAPHORE_LIMIT, parent: barrier)

    results =
      N_TIMES.times.map do |i|
        semaphore.async(parent: barrier) do
          puts "Run #{i}"
          retry_count = 0
          response = nil
          while retry_count < RETRIES
            begin
              response = internet.get("https://httpbin.org/delay/0.5")
              break if response
            rescue StandardError => e
              puts "Failed attempt #{retry_count + 1}: #{e.message}"
              sleep(0.5)
            end
            retry_count += 1
          end
          i if response
        end
      end
    barrier.wait
    results = results.map(&:result)
    puts results.compact.uniq.size
  end

Currently, I have some retry logic in case the internet request fails; in that case, I rerun the request until it succeeds. I expected that with a reasonably low semaphore limit I wouldn't see many errors like Broken Pipe or Stream Closed. I tried varying the semaphore limit, but the upper limit is still ~100 before I start seeing quite a few errors. Just curious: is there a better way to tune these numbers for a given system, and is there a recommended retry protocol for HTTP requests with Async::HTTP::Internet?

ioquatix commented 3 weeks ago

If the request fails for non-fatal reasons, it will already be retried, e.g. https://github.com/socketry/async-http/blob/main/lib/async/http/client.rb#L117

You shouldn't need to implement your own retry logic; the default should be sufficient. However, you might want to handle rate limiting differently.
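
If you do want custom handling for rate limiting specifically, one approach is to retry only transient network errors with exponential backoff, rather than retrying everything. A sketch in plain Ruby (the `with_backoff` name and the exact error list are illustrative, not part of async-http):

```ruby
# Illustrative helper: retry transient network errors with exponential
# backoff, re-raising once the retry budget is exhausted.
def with_backoff(retries: 3, base: 0.5)
  attempt = 0
  begin
    yield
  rescue IOError, EOFError, Errno::ECONNRESET, Errno::EPIPE
    attempt += 1
    raise if attempt > retries
    sleep(base * (2 ** (attempt - 1)))  # 0.5s, 1s, 2s, ...
    retry
  end
end

# Usage (in the snippet above):
#   response = with_backoff { internet.get("https://httpbin.org/delay/0.5") }
```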

Your semaphore limits the number of concurrent requests, but internally Async::HTTP::Internet will also limit the number of active connections to a given host.

See https://github.com/socketry/async-http/blob/019310e275b7b892e297b74d67cb09e33414bc59/lib/async/http/client.rb#L33 - specifically the connection limit and retries.

fangherk commented 3 weeks ago

Hmm, so I dug deeper with pry and found a different error being raised.

This is what I added to the code here: https://github.com/socketry/async-http/blob/main/lib/async/http/client.rb#L122-138

rescue SocketError, IOError, EOFError, Errno::ECONNRESET, Errno::EPIPE
  if connection
    @pool.release(connection)
    connection = nil
  end

  if request.idempotent? and attempt < @retries
    retry
  else
    raise
  end
rescue StandardError => e
  binding.pry # checked the result here
ensure
  if connection
    @pool.release(connection)
  end
end

This is still running the code from the original post, and it is producing #<Protocol::HTTP2::StreamError: Stream closed!>. Is this considered a non-fatal retry reason?

ioquatix commented 3 weeks ago

I'll take a closer look at this tomorrow.

StreamError in this context, with the message "Stream closed" probably indicates that the remote server is closing the stream. HTTP/2 advertises a maximum concurrency count, which we respect, but perhaps the remote server has some custom rate limiting.

You can try running with more debug output to see if it reveals anything:

CONSOLE_LEVEL=debug ./test.rb

ioquatix commented 3 weeks ago

Okay, I was digging around:

irb(#<Async::HTTP::Protocol::HTTP...):004> self.remote_settings
=> 
#<Protocol::HTTP2::Settings:0x000000011f07c0e0
 @enable_connect_protocol=0,
 @enable_push=1,
 @header_table_size=4096,
 @initial_window_size=65536,
 @maximum_concurrent_streams=128,
 @maximum_frame_size=16777215,
 @maximum_header_list_size=4294967295>

but...

> self.count
=> 1000

Basically, the connection pool is not load balancing correctly. This looks like a bug; I'll try to make a failing test case tomorrow, and then we can figure out a fix.
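
A back-of-envelope check of what correct balancing would imply here: with 1000 streams in flight on a single connection (the self.count above) against an advertised limit of 128, the pool should be spreading the load over several connections:

```ruby
in_flight = 1000                  # self.count observed above
maximum_concurrent_streams = 128  # from the remote SETTINGS frame

# Minimum number of connections needed so that no single connection
# exceeds the server-advertised stream limit.
connections_needed = (in_flight.to_f / maximum_concurrent_streams).ceil
puts connections_needed  # => 8
```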

fangherk commented 3 weeks ago

Oh interesting, so it seems like the issue is that the async client isn't respecting @maximum_concurrent_streams. It's going over that value with the count, but the client should be able to figure it out once the frames update the remote settings (after here: https://github.com/socketry/protocol-http2/blob/main/lib/protocol/http2/connection.rb#L266-L274). Then, I guess, it will automatically load balance to that value.

In this case, would you expect setting Async::HTTP::Internet.new(connection_limit: 128) to also handle the issue? I was testing that, but I still need to set SEMAPHORE_LIMIT to 128 to avoid running into any stream closed errors.

ioquatix commented 3 weeks ago

That all sounds reasonable... I'll try to have a fix for the connection pool today.

ioquatix commented 3 weeks ago

Okay, I've released async-http v0.67.1, which includes the fix.

I've also added an example: https://github.com/socketry/async-http/blob/main/examples/delay/client.rb

fangherk commented 3 weeks ago

Ran it on the new version, it works. Thank you for fixing this! 🎉

ioquatix commented 3 weeks ago

Thanks for reporting the problem!