We are happy to present the new 2.34.0 release of Beam.
This release includes both improvements and new functionality.
See the download page for this release.
The Beam Java API for Calcite SqlTransform is no longer experimental (BEAM-12680).
Python's ParDo (Map, FlatMap, etc.) transforms now support a with_exception_handling option for easily ignoring bad records and implementing the dead letter pattern.
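A minimal sketch of how the with_exception_handling option might be used for the dead letter pattern. The `parse_order` function, the `run` helper, and the sample records are hypothetical, and the sketch assumes the wrapped transform yields a main output followed by a dead-letter output whose elements pair the failing input with error details. Check the SDK documentation for your Beam version before relying on that shape.

```python
import json


def parse_order(line):
    """Parse one JSON line; raises on malformed input or a missing field."""
    record = json.loads(line)
    return record["id"], float(record["amount"])


def run(lines):
    # Imported lazily so parse_order stays usable (and unit-testable)
    # on machines where apache-beam is not installed.
    import apache_beam as beam

    with beam.Pipeline() as pipeline:
        # Assumption: the wrapped Map returns (main output, dead-letter output).
        good, bad = (
            pipeline
            | "Create" >> beam.Create(lines)
            | "Parse" >> beam.Map(parse_order).with_exception_handling()
        )
        good | "PrintGood" >> beam.Map(print)
        # Assumption: each dead-letter element is (failing input, error details).
        bad | "PrintBad" >> beam.Map(
            lambda failure: print("dead letter:", failure[0]))


if __name__ == "__main__":
    run(['{"id": "a1", "amount": "9.50"}', "not json", '{"id": "b2"}'])
```

In a real pipeline the dead-letter output would usually be written to a durable sink for later inspection rather than printed.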
I/Os
ReadFromBigQuery and ReadAllFromBigQuery now run queries with BATCH priority by default. The query_priority parameter is introduced to the same transforms to allow configuring the query priority (Python) (BEAM-12913).
[EXPERIMENTAL] Support for BigQuery Storage Read API added to ReadFromBigQuery. The newly introduced method parameter can be set as DIRECT_READ to use the Storage Read API. The default is EXPORT which invokes a BigQuery export request. (Python) (BEAM-10917).
[EXPERIMENTAL] Added use_native_datetime parameter to ReadFromBigQuery to configure the return type of DATETIME fields. This parameter can only be used when method = DIRECT_READ (Python) (BEAM-10917).
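The BigQuery read options above could be combined roughly as follows. This is a sketch, not a definitive recipe: the helper names and the project, dataset, and table values are hypothetical, and it assumes `BigQueryQueryPriority` and `ReadFromBigQuery.Method` are importable from `apache_beam.io.gcp.bigquery` as in the 2.34.0 Python SDK.

```python
def table_spec(project, dataset, table):
    """Build a table reference in the PROJECT:DATASET.TABLE form."""
    return f"{project}:{dataset}.{table}"


def read_query_interactive(query):
    # Lazy import: requires the apache-beam[gcp] extra at call time.
    from apache_beam.io.gcp.bigquery import (
        BigQueryQueryPriority, ReadFromBigQuery)

    # Queries now default to BATCH priority; opt back into INTERACTIVE.
    return ReadFromBigQuery(
        query=query,
        use_standard_sql=True,
        query_priority=BigQueryQueryPriority.INTERACTIVE,
    )


def read_table_direct(project, dataset, table):
    from apache_beam.io.gcp.bigquery import ReadFromBigQuery

    # EXPERIMENTAL: use the Storage Read API instead of the default EXPORT.
    # use_native_datetime is only valid together with DIRECT_READ.
    return ReadFromBigQuery(
        table=table_spec(project, dataset, table),
        method=ReadFromBigQuery.Method.DIRECT_READ,
        use_native_datetime=True,
    )
```

Either helper returns a transform that can be applied to a pipeline, for example `pipeline | read_table_direct("my-project", "my_dataset", "events")`, where the names are placeholders.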
Added a new dataframe extra to the Python SDK that tracks pandas versions we've verified compatibility with. We now recommend installing Beam with `pip install apache-beam[dataframe]` when you intend to use the DataFrame API (BEAM-12906).
Added an example of deploying a Python Apache Beam job with a Spark cluster.
[Go SDK] beam.TryCrossLanguage's signature now matches beam.CrossLanguage. Like other Try functions it returns an error instead of panicking. (BEAM-9918).
Fixed BEAM-12925, in which incorrect null data read from JdbcIO was silently passed through. Pipelines affected by this will now start throwing failures instead of silently passing incorrect data.
Bugfixes
Fixed error while writing multiple DeferredFrames to csv (Python) (BEAM-12701).
Fixed error when importing the DataFrame API with pandas 1.0.x installed (BEAM-12945).
Fixed top.SmallestPerKey implementation in the Go SDK (BEAM-12946).
List of Contributors
According to git shortlog, the following people contributed to the 2.34.0 release. Thank you to all contributors!
Ahmet Altay,
Aizhamal Nurmamat kyzy,
Alex Amato,
Alexander Chermenin,
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Bumps apache-beam[gcp] from 2.33.0 to 2.34.0.
Release notes
Sourced from apache-beam[gcp]'s releases.
... (truncated)
Changelog
Sourced from apache-beam[gcp]'s changelog.
Commits
- `b3b1843` Set version for 2.34.0 RC2
- `bec9149` Merge pull request #15906 from ibzib/BEAM-13143-cp
- `7e777d8` Merge pull request #15902 from TheNeuralBit/BEAM-13187-cp
- `86e2e0e` [BEAM-13143] Fix python doc generator error.
- `c3e6521` Merge pull request #15891: [BEAM-13187] Set filesToStage after full jar resol...
- `621a4f9` Merge pull request #15806 from ibzib/parquet-cp
- `2d655ea` [BEAM-13104] ParquetIO: SplitReadFn must read the whole block
- `f3c1a19` Merge pull request #15791 from ibzib/dicom-cp
- `fd0338c` [BEAM-12694] Include datetime in dicom test dataset name.
- `9c729f2` Merge pull request #15745 from ibzib/trigger