I like option 3 because I think we should take such concerns upon ourselves and shield users from them. Otherwise we have broadway_gcpb that uses pubsub that uses tesla, and at that point the user is many layers deep in figuring out how to make everything work. By keeping it internal, we can also switch to option 2 once that support is out.
Also, I believe Amazon SQS also defaults to hackney, so maybe we can discuss a unified configuration/API for passing options down to the HTTP client. Another thing we may want to do is always configure the HTTP client pool to have at least the number of producers + the number of processors. Although I have seen some people setting up thousands of processors, so we may want a cap on that: maybe number of producers + 2 * number of cores? See the sketch below.
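A minimal sketch of that sizing heuristic, assuming hackney's pool API; the variable names, pool name, and cap are illustrative, not an agreed-upon API:

# hypothetical sizing: at least producers + processors, capped at producers + 2 * cores
pool_size = num_producers + min(num_processors, 2 * System.schedulers_online())
:hackney_pool.start_pool(:broadway_pool, max_connections: pool_size, timeout: 15_000)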
@josevalim are you thinking we should call hackney_pool:start_pool for the pipeline? Is there a good place for this to live? Currently the only place I can think to put it would be in the producer's start_link, but wouldn't that require a separate pool per producer stage?
Maybe start_pool is idempotent? So we could always call it from each producer? But we could also add a new Broadway callback, if necessary, that is invoked once for all producers... though I guess that would still need idempotency because of restarts and whatnot.
@josevalim good call, at least on the outside it looks like it's idempotent indeed:
iex(11)> :hackney_pool.start_pool(:my_pool, [])
:ok
iex(12)> :hackney_pool.start_pool(:my_pool, [])
:ok
iex(13)> :hackney_pool.start_pool(:my_pool, [])
:ok
iex(14)> :hackney_pool.start_pool(:my_pool, [])
:ok
iex(15)> {:ok, 200, _, ref} = :hackney.request(:get, "https://httpbin.org/json", [], "", pool: :my_pool); :hackney.body(ref)
{:ok,
"{\n \"slideshow\": {\n \"author\": \"Yours Truly\", \n \"date\": \"date of publication\", \n \"slides\": [\n {\n \"title\": \"Wake up to WonderWidgets!\", \n \"type\": \"all\"\n }, \n {\n \"items\": [\n \"Why <em>WonderWidgets</em> are great\", \n \"Who <em>buys</em> WonderWidgets\"\n ], \n \"title\": \"Overview\", \n \"type\": \"all\"\n }\n ], \n \"title\": \"Sample Slide Show\"\n }\n}\n"}
iex(16)> {:ok, 200, _, ref} = :hackney.request(:get, "https://httpbin.org/json", [], "", pool: :my_pool); :hackney.body(ref)
{:ok,
"{\n \"slideshow\": {\n \"author\": \"Yours Truly\", \n \"date\": \"date of publication\", \n \"slides\": [\n {\n \"title\": \"Wake up to WonderWidgets!\", \n \"type\": \"all\"\n }, \n {\n \"items\": [\n \"Why <em>WonderWidgets</em> are great\", \n \"Who <em>buys</em> WonderWidgets\"\n ], \n \"title\": \"Overview\", \n \"type\": \"all\"\n }\n ], \n \"title\": \"Sample Slide Show\"\n }\n}\n"}
There's also a way to use a custom pool.
@wojtekmach sweet, thanks for checking on that!
I'll be honest, I felt a little out of my depth looking at the hackney_disp example for a custom pool handler, but if a custom handler feels like the way to go, I can start diving deeper into that.
I think a regular pool would be just fine! :D
I was just mentioning the custom pool option for completeness; agreed about using the regular one!
The Problem
Using the default GoogleApiClient without return_immediately: true, there is a high likelihood that messages will not be acknowledged before they reach their deadline.
Background Info
If there are no messages in the topic queue, the subscription "pull request" will be held open for a "bounded amount of time", unless the user specifies return_immediately: true, at which point the server will respond immediately with an empty list† and the producer will have to wait the entire :receive_interval before beginning another pull request.
Currently, an acknowledge request can get blocked by an in-flight pull request, at which point the messages being acknowledged are almost certainly going to be returned to the queue, as they will have exceeded their deadline by the time the acknowledge request is actually sent.
As laid out in googleapis/elixir-google-api#98, all GoogleApi connections currently default to using Tesla with the httpc adapter. As of the latest version of Gax (0.1.3), this behavior is still not directly configurable.
Possible Solutions
1) Recommend global adapter configuration
We could recommend that users install and configure a global adapter for Tesla. This would resolve the issues with httpc, but it still won't allow for per-request adapter configuration, which we really need so we can hold open the pull request until a message arrives or the server disconnects us.
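For reference, a global Tesla adapter is normally set in the application config; a minimal sketch, assuming hackney is the adapter we would recommend:

# config/config.exs
# use hackney for every Tesla client that doesn't specify its own adapter
config :tesla, adapter: Tesla.Adapter.Hackney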
2) Wait for vendor adoption
We could wait for Gax to support Tesla 1.2, at which point we should be able to provide runtime adapter configuration. If/when that happens, the next question would be how much of this adapter-related configuration should be exposed to end users?
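For context, Tesla 1.2 lets the caller pass the adapter and its options when building a client at runtime; a minimal sketch of that mechanism, independent of whatever API Gax ends up exposing:

# hypothetical: per-client runtime adapter configuration in Tesla >= 1.2
middleware = [{Tesla.Middleware.BaseUrl, "https://pubsub.googleapis.com"}]
client = Tesla.client(middleware, {Tesla.Adapter.Hackney, pool: :broadway_pool, recv_timeout: 30_000})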
3) Ship with a working adapter
Ideally, as an end user, I can ignore all of the adapter complexity and just focus on handling messages. One way for that to be true would be to add the following dependencies:
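The dependency snippet didn't survive extraction; a plausible sketch, assuming we'd pull in tesla (which ships Tesla.Adapter.Hackney) and hackney itself, with illustrative version requirements:

# mix.exs (hypothetical)
defp deps do
  [
    {:tesla, "~> 1.2"},
    {:hackney, "~> 1.15"}
  ]
end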
and then inject the adapter into the Tesla.Client after it's been created:
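The original snippet is missing here as well; a minimal sketch of one way to do the injection, assuming Tesla 1.x's internal %Tesla.Client{} layout, where the adapter is stored as a {module, :call, [opts]} tuple (an internal detail, so treat this as illustrative only):

# hypothetical helper that swaps the adapter on an already-built client
defp put_hackney_adapter(%Tesla.Client{} = client, opts) do
  %{client | adapter: {Tesla.Adapter.Hackney, :call, [opts]}}
end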
so that we can set adapter options at runtime:
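A hypothetical usage of that helper, showing the runtime options we care about here (a generous recv_timeout so the pull request can stay open, plus a shared pool); GoogleApi.PubSub.V1.Connection.new/1 is the standard GoogleApi connection builder:

# hypothetical: build the connection, then override its adapter options at runtime
# token is an OAuth2 access token string
conn =
  token
  |> GoogleApi.PubSub.V1.Connection.new()
  |> put_hackney_adapter(pool: :broadway_pool, recv_timeout: :infinity)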
Steps to Reproduce:
1) Start a pipeline using BroadwayCloudPubSub.Producer.
2) Set a :base_url for Pub/Sub in the app configuration (see the sketch below).
† When the queue is empty, the PullResponse struct actually returns nil for receivedMessages, which breaks the typespec, but I digress.
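A hypothetical app-config snippet for that :base_url step; the :google_api_pub_sub application key and the emulator URL are assumptions, not taken from this issue:

# config/config.exs (hypothetical)
# point the generated Pub/Sub client at a local emulator
config :google_api_pub_sub, base_url: "http://localhost:8085"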
Thoughts? @wojtekmach @msaraiva @josevalim