apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[Task]: Remove archaic PubSubSource and PubSubSink #27443

Open robertwb opened 1 year ago

robertwb commented 1 year ago

What needs to happen?

Reading from and writing to PubSub should be proper (if primitive) transforms rather than "special" parameters of the Read/Write operations, which necessitate workarounds like https://github.com/apache/beam/blob/release-2.48.0/sdks/python/apache_beam/io/iobase.py#L946.
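
To make the distinction concrete, here is a toy sketch in plain Python. It is not Beam code: `ToyPubSubSource`, `ToyRead`, `ToyReadFromPubSub`, the `describe` method, and the `toy:` identifiers are all hypothetical names invented for illustration. It only shows why a generic read that carries a "special" source has to inspect its own parameter, while a dedicated transform does not.

```python
from dataclasses import dataclass

# Toy model only: these classes are hypothetical and are not Beam's API.


@dataclass
class ToyPubSubSource:
    """A 'special' source carried as a parameter of a generic read."""
    topic: str


@dataclass
class ToyRead:
    """Generic read operation wrapping an arbitrary source object."""
    source: object

    def describe(self):
        # The workaround: the generic operation inspects its parameter and
        # reports a different identity when it finds the special source.
        if isinstance(self.source, ToyPubSubSource):
            return ("toy:pubsub_read", self.source.topic)
        return ("toy:generic_read", repr(self.source))


@dataclass
class ToyReadFromPubSub:
    """The alternative: a primitive transform with its own identity."""
    topic: str

    def describe(self):
        # No special-casing: the transform simply describes itself.
        return ("toy:pubsub_read", self.topic)


old_style = ToyRead(ToyPubSubSource("projects/p/topics/t"))
new_style = ToyReadFromPubSub("projects/p/topics/t")
```

In this sketch both styles end up with the same description, but only the first needs an `isinstance` check inside the generic operation to get there.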

Issue Priority

Priority: 3 (nice-to-have improvement)

Issue Components

Abacn commented 2 weeks ago

Does this mean that Python PubsubIO can currently run only on the Dataflow runner (in addition to the Python direct runner, which has a local override)?

robertwb commented 2 weeks ago

Correct; currently they run only on Dataflow (which swaps out the PubSub transforms) and the old Python direct runner (which also swaps them out).
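
The practical consequence can be sketched as a toy lookup table, again in plain Python rather than Beam. The runner keys echo the ones named in this thread, but `RUNNER_OVERRIDES`, `run_pubsub_read`, the `toy:pubsub_read` identifier, and the `other_runner` entry are hypothetical: a runner can execute the PubSub read only if it supplies its own replacement.

```python
# Toy model only: everything here is hypothetical and is not Beam's API.
PUBSUB_READ = "toy:pubsub_read"


def _dataflow_read(topic):
    # Stand-in for a service-side implementation.
    return f"dataflow reads {topic}"


def _direct_read(topic):
    # Stand-in for a local implementation.
    return f"direct runner reads {topic}"


# Each runner lists the operations it swaps out with its own implementation.
RUNNER_OVERRIDES = {
    "dataflow": {PUBSUB_READ: _dataflow_read},
    "old_python_direct": {PUBSUB_READ: _direct_read},
    "other_runner": {},  # no override, so nothing can execute the read
}


def run_pubsub_read(runner, topic):
    """Execute the read if the runner has an override; otherwise fail."""
    overrides = RUNNER_OVERRIDES[runner]
    if PUBSUB_READ not in overrides:
        raise NotImplementedError(f"{runner} has no override for {PUBSUB_READ}")
    return overrides[PUBSUB_READ](topic)
```

Under these assumptions, `run_pubsub_read("other_runner", ...)` raises `NotImplementedError`, which mirrors the limitation described above.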