apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[yaml] Normalize SpannerIO #28686

Open Polber opened 1 year ago

Polber commented 1 year ago

What would you like to happen?

This bug is part of the Beam YAML milestone. More information about Beam YAML can be found here: https://github.com/apache/beam/tree/master/sdks/python/apache_beam/yaml#beam-yaml-api

A Beam YAML contribution guide can be found here: https://s.apache.org/beam-yaml-contribute


This IO is one of the many Beam IOs that need to be normalized to be compatible with Beam YAML. Normalization here means that the IO should be consistent across the Python and Java SDKs in a way that can be exposed to the Beam YAML framework.

For Java, this means taking advantage of the cross-language framework, which allows transforms to be exposed to all the SDKs through the External Transform service.

For Python, this means ensuring that any existing Python implementations have the same functionality and expose the same parameters as their Java counterparts.

After normalization, the Java and Python versions of an IO transform can be interchanged freely without compromising any functionality within the rest of the pipeline.
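
As a rough illustration of the parameter-parity part of normalization, the sketch below compares the parameter names exposed by two implementations and reports any mismatch. The parameter names and both helper functions are hypothetical, chosen only for illustration; they are not taken from the actual SpannerIO implementations.

```python
# Hypothetical parameter names, for illustration only.
# These are NOT the real SpannerIO parameters in either SDK.
JAVA_PARAMS = {"project_id", "instance_id", "database_id", "table", "query"}
PYTHON_PARAMS = {"project_id", "instance_id", "database_id", "table"}


def parity_report(java_params, python_params):
    """Return the parameters each side is missing relative to the other."""
    java_params, python_params = set(java_params), set(python_params)
    return {
        "missing_in_python": sorted(java_params - python_params),
        "missing_in_java": sorted(python_params - java_params),
    }


def is_normalized(java_params, python_params):
    """True when both implementations expose exactly the same parameter names."""
    return set(java_params) == set(python_params)


report = parity_report(JAVA_PARAMS, PYTHON_PARAMS)
print(report)
print(is_normalized(JAVA_PARAMS, PYTHON_PARAMS))
```

Matching names is only a necessary condition in this sketch; functional equivalence of the two transforms would still need to be checked separately.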

A guide on how to normalize the IOs can be found here: https://docs.google.com/document/d/1oXqMxE4Gl4Uj3dBuVx1qgbVRxrVWuPFDYW0QRRLxQyE/edit

Issue Priority

Priority: 2 (default / most feature requests should be filed as P2)

Issue Components

Polber commented 6 months ago

.take-issue