googleapis / python-pubsub

Apache License 2.0

PubSub: support batching publish requests with asyncio #20

Open relud opened 5 years ago

relud commented 5 years ago

Is your feature request related to a problem? Please describe.

I have an asyncio application that needs to publish messages to PubSub, but I'm having issues because google.cloud.pubsub.PublisherClient.publish:

  1. returns futures that aren't compatible with await or asyncio.wrap_future
  2. returns futures that never complete if Batch._commit throws an uncaught exception (like in googleapis/google-cloud-python#7103 and googleapis/google-cloud-python#7071)
  3. doesn't enforce a maximum number of threads, which is eating memory
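
To illustrate item 1, here is a minimal sketch of one way to bridge a thread-based future that only exposes `add_done_callback`, `result`, and `exception` into an awaitable asyncio future. `CallbackFuture` and `to_asyncio` are hypothetical names made up for this example; `CallbackFuture` is only a stand-in for the publish future, not the library's class, and the real future may behave differently.

```python
import asyncio
import threading


class CallbackFuture:
    """Hypothetical stand-in for a thread-based future; not the library's class."""

    def __init__(self):
        self._lock = threading.Lock()
        self._done = False
        self._result = None
        self._exception = None
        self._callbacks = []

    def add_done_callback(self, fn):
        with self._lock:
            if not self._done:
                self._callbacks.append(fn)
                return
        fn(self)  # already finished: invoke the callback right away

    def _finish(self, result=None, exception=None):
        with self._lock:
            self._result, self._exception, self._done = result, exception, True
            callbacks, self._callbacks = self._callbacks, []
        for fn in callbacks:
            fn(self)

    def set_result(self, result):
        self._finish(result=result)

    def set_exception(self, exception):
        self._finish(exception=exception)

    def result(self):
        if self._exception is not None:
            raise self._exception
        return self._result

    def exception(self):
        return self._exception


def to_asyncio(future, loop):
    """Return an asyncio future that mirrors `future` (hypothetical helper)."""
    aio_future = loop.create_future()

    def _copy(done):
        # Runs on the event loop thread, so touching aio_future is safe here.
        if aio_future.cancelled():
            return
        exc = done.exception()
        if exc is not None:
            aio_future.set_exception(exc)
        else:
            aio_future.set_result(done.result())

    # The callback may fire on a worker thread, so hand off to the loop.
    future.add_done_callback(lambda done: loop.call_soon_threadsafe(_copy, done))
    return aio_future
```

With this in place, a coroutine could write `await to_asyncio(future, asyncio.get_running_loop())`, and an exception set on the original future would be raised at the `await`.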

Describe the solution you'd like

I wrote a new google.cloud.pubsub_v1.publisher._batch.async.Batch that implements google.cloud.pubsub_v1.publisher._batch.base.Batch. It uses asyncio to provide awaitable futures that automatically propagate exceptions. It uses a shared concurrent.futures.ThreadPoolExecutor in conjunction with asyncio.wrap_future to asynchronously call Batch.client.publish while enforcing a maximum number of workers. I specifically only wrapped Batch.client.publish in a thread because (if I understand correctly) it only blocks on exclusive access to the gRPC channel, so it shouldn't create performance issues as seen in the first alternative below.
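
The pattern described above can be sketched roughly as follows. This is not the implementation linked later in the thread, only a minimal illustration under assumptions: `blocking_publish` is a hypothetical stand-in for the blocking `Batch.client.publish` call, and `EXECUTOR`, `publish_batch`, and the worker count of 4 are made up for the example.

```python
import asyncio
import concurrent.futures
import time

# Shared pool: caps how many threads can run blocking publish calls at once.
EXECUTOR = concurrent.futures.ThreadPoolExecutor(max_workers=4)


def blocking_publish(topic, messages):
    """Hypothetical stand-in for the blocking Batch.client.publish call."""
    if not messages:
        raise ValueError("empty batch")
    time.sleep(0.01)  # simulate waiting on the channel
    return [f"{topic}-{i}" for i in range(len(messages))]


async def publish_batch(topic, messages):
    # submit() returns a concurrent.futures.Future; wrap_future makes it
    # awaitable, and an exception raised in the worker thread is re-raised
    # in the awaiting coroutine instead of being lost.
    future = EXECUTOR.submit(blocking_publish, topic, messages)
    return await asyncio.wrap_future(future)
```

Because every call goes through the same executor, at most `max_workers` threads exist for publishing, and a failure in the worker surfaces at the `await` rather than leaving a future that never completes.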

I would like to submit this as a pull request, but only if it would be useful.

Describe alternatives you've considered

relud commented 5 years ago

For the record, this is my current implementation of an asyncio batch: https://github.com/mozilla/gcp-ingestion/blob/24c1cea/ingestion-edge/ingestion_edge/util.py#L7-L95

anguillanneuf commented 5 years ago

I haven't tried this myself, but there's a proposed solution to make publish future act like a concurrent Future: https://github.com/googleapis/google-cloud-python/issues/6201#issuecomment-472155433

plamut commented 5 years ago

Python 2 support is deprecated, but it still needs to be preserved until the end of the year. Any support for asyncio will thus have to wait for at least another 6 months or so.

agates4 commented 5 years ago

Do we have any update on this?

matehat commented 4 years ago

Now that we're past the date when Python 2 support was officially dropped, can we have an update on this? Is there a timeline for official asyncio support?

I'm sure we're not alone in trying to use google cloud pubsub with asyncio-based libraries.

meredithslota commented 4 years ago

About a week ago we released a new version of Pub/Sub that drops Python 2.7 (and 3.5) support: https://github.com/googleapis/python-pubsub/releases/tag/v2.0.0