lostisland / faraday

Simple, but flexible HTTP client library, with support for multiple backends.
https://lostisland.github.io/faraday
MIT License
5.74k stars 976 forks source link

Clarify diff between connection settings `timeout` and `open_timeout` #1470

Closed Yu-Chieh-Henry-Yang closed 1 year ago

Yu-Chieh-Henry-Yang commented 1 year ago

Description

This PR aims to clarify the difference between the config settings timeout and open_timeout.

I am very confused about the difference between timeout and open_timeout.

Originally, the comment for these 2 config settings says:

  #   - `:timeout`  open/read timeout Integer in seconds
  #   - `:open_timeout` - read timeout Integer in seconds

Also when I was looking at this, I don't really know about TCP connection phase and reading phase and whether they are exclusive of each so searching online to clarify this is difficult as well. I found it difficult to clarify whether timeout actually includes the time in open_timeout and I have asked a question in here: https://github.com/lostisland/faraday/issues/1469.

The reason that I think timeout includes open_timeout is because:

  1. I saw faraday gem using the typhoeus adapter in this file: https://github.com/lostisland/faraday/blob/e7bbb4e7500590b13c47010546e8f75e19c19eb4/lib/faraday/adapter/typhoeus.rb#L12
  2. faraday's timeout is represented by timeout_ms in typhoeus and faraday's open_timeout is represented by connecttimeout_ms in typhoeus according to this file
  3. There is documentation regarding timeout in the typhoeus gem README.md. It says

timeout is the time limit for the entire request in seconds. connecttimeout is the time limit for just the connection phase, again in seconds.

Yu-Chieh-Henry-Yang commented 1 year ago

Thank you for looking into this and the comment.

I could be wrong, but "total timeout for both the connection-opening phase and the reading phase" might not be correct.

When I created this PR, I am not certain about what it does as well. If this might not be correct, do you know what would be correct instead? To be specific, let's say we have the following phase when we make a request using faraday:

Does timeout only represent the phase 2 and not (phase 1 + phase 2)?

Does open_timeout only represent phase 1? (If yes, why does it says read timeout Integer in seconds instead of open timeout Integer in seconds in the original code comment?)

Please let me know if have a mistake in understanding anywhere.

yykamei commented 1 year ago

Does timeout only represent the phase 2 and not (phase 1 + phase 2)?

timeout is just the default value for open_timeout, read_timeout, and write_timeout. If only timeout is set, open_timeout, read_timeout, and write_timeout will be set with the value of timeout.

This is where timeout is used in Faraday:

https://github.com/lostisland/faraday/blob/20a5d573f95f5cc1fb39d55629932f1ca90dbf78/lib/faraday/adapter.rb#L90

In faraday-net_http, read_timeout, write_timeout, and open_timeout are set with #request_timeout.

https://github.com/lostisland/faraday-net_http/blob/0f9970b095ef7ab847883809ad500f453094c88f/lib/faraday/adapter/net_http.rb#L152-L170

Does open_timeout only represent phase 1? (If yes, why does it says read timeout Integer in seconds instead of open timeout Integer in seconds in the original code comment?)

Yes in faraday-net_http, but I'm not sure why the code comment says "read timeout Integer in seconds". I think "open timeout Integer in seconds" is correct 👍

Yu-Chieh-Henry-Yang commented 1 year ago

For typhoeus adapter:

  1. faraday's timeout is represented by timeout_ms in typhoeus and faraday's open_timeout is represented by connecttimeout_ms in typhoeus according to this file
  2. There is documentation regarding timeout in the typhoeus gem README.md. It says

timeout is the time limit for the entire request in seconds. connecttimeout is the time limit for just the connection phase, again in seconds.

so I think the understanding around open_timeout is pretty consistent whereas the use and meaning of timeout varies between different adapters.

I will update the code for my PR to reflect that.

Yu-Chieh-Henry-Yang commented 1 year ago

In the future, it's probably better if we can expose open_timeout, read_timeout, write_timeout and total_timeout and default_timeout as configurable options so that adapters can use the correctly named options.

But for now I have expanded the explanation comment to point out that there is an uncertainty on how the timeout option is used.

iMacTia commented 1 year ago

OK so it looks like we have some outdated documentation in this file. We actually support 4 types of timeouts 😄

The revised documentation should note all 4 timeout types and then make clear that open_timeout, read_timeout and write_timeout are only supported in some adapters.

Yu-Chieh-Henry-Yang commented 1 year ago

Thanks for the explanation @iMacTia

According to the comment in https://github.com/lostisland/faraday/pull/1470#issuecomment-1357172732, it seems like the timeout value may be used differently in different adapter like faraday-net_http. For faraday-net_http, it seems to use timeout as the default value of open_timeout, read_timeout and write_timeout.

Would you say that it is using the timeout incorrectly? (I'm asking this just trying to understand timeout correctly)

faraday's #request_timeout method falls back to timeout value as we can see in here:

https://github.com/lostisland/faraday/blob/20a5d573f95f5cc1fb39d55629932f1ca90dbf78/lib/faraday/adapter.rb#L90

When both open_timeout and read_timeout are set via #request_timeout method, this cause the total timeout from open_timeout and read_timeout added together to exceed the value of timeout, should we perhaps remove the ability for this #request_timeout method to fall back to timeout? (Maybe in a future major release, since this is breaking change)

Yu-Chieh-Henry-Yang commented 1 year ago

I have updated the comment again according to our discussion, if the formatting or explanation is incorrect, please let me know. I'm happy to change them.

iMacTia commented 1 year ago

I haven't touched that code in a while, so a couple of questions come up on the Net::HTTP internals:

  1. Does it support a global timeout config? If not, then what we're doing is the most sensible thing and closest to what the user might possibly want.
  2. When does the read timeout begin to count? If it counts from the connection opening, then yeah we could potentially double the timeout. But maybe it's counting from the request start?
iMacTia commented 1 year ago

If Net::HTTP does indeed support a global timeout, then I agree we should set that instead, but I'd be surprised we're not using that already 🤔

Yu-Chieh-Henry-Yang commented 1 year ago

I see, thanks for the response, I don't know much about how the Net::HTTP works but this has answered my question.

I don't have the permission to merge this PR but I'm happy for this PR to be merged