We are happy to present the new 2.57.0 release of Beam.
This release includes both improvements and new functionality.
See the download page for this release.
Ensure that BigtableIO closes the reader streams (#31477).
New Features / Improvements
Added Feast feature store handler for enrichment transform (Python) (#30957).
BigQuery per-worker metrics are reported by default for Streaming Dataflow Jobs (Java) (#31015).
Adds inMemory() variant of Java List and Map side inputs for more efficient lookups when the entire side input fits into memory.
Beam YAML now supports the Jinja templating syntax.
Template variables can be passed with the (JSON-formatted) --jinja_variables flag.
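Because the flag value must be valid JSON, quoting it by hand on a shell command line is error-prone. A minimal Python sketch of building the argument follows. The variable names (`input_path`, `min_count`) and the pipeline file name are hypothetical, and the `python -m apache_beam.yaml.main --yaml_pipeline_file=...` invocation is an assumption about the Beam YAML entry point that these notes do not state.

```python
import json
import shlex


def jinja_variables_flag(variables):
    """Return a shell-safe --jinja_variables=<json> argument."""
    # json.dumps guarantees the value is valid JSON; shlex.quote protects
    # the braces and double quotes from the shell.
    return shlex.quote(
        "--jinja_variables=" + json.dumps(variables, sort_keys=True))


# Hypothetical template variables, referenced in the YAML as
# {{ input_path }} and {{ min_count }}.
variables = {"input_path": "gs://my-bucket/input.csv", "min_count": 10}

command = " ".join([
    "python -m apache_beam.yaml.main",     # assumed entry point
    "--yaml_pipeline_file=pipeline.yaml",  # hypothetical file name
    jinja_variables_flag(variables),
])
print(command)
```

Generating the JSON programmatically keeps the value well-formed even when a variable contains quotes or spaces.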
DataFrame API now supports pandas 2.1.x and adds 12 more string functions for Series (#31185).
Added BigQuery handler for enrichment transform (Python) (#31295).
Disable soft delete policy when creating the default bucket for a project (Java) (#31324).
Added DoFn.SetupContextParam and DoFn.BundleContextParam, which can be used
as a Python DoFn.process, Map, or FlatMap parameter to invoke a context
manager per DoFn setup or bundle (analogous to using setup/teardown
or start_bundle/finish_bundle, respectively).
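The analogy with setup/teardown can be pictured without Beam at all. The sketch below is not Beam's implementation or API: it uses a hypothetical `FakeWorker` class and the standard library's `contextlib.ExitStack` to show a context manager being entered once at setup, shared by every `process` call, and exited at teardown.

```python
import contextlib

events = []


@contextlib.contextmanager
def open_connection():
    """Hypothetical resource; records when it is opened and closed."""
    events.append("open")
    try:
        yield "connection"
    finally:
        events.append("close")


class FakeWorker:
    """Stand-in for a runner driving a DoFn; not part of Beam."""

    def __init__(self, fn, context_factory):
        self._fn = fn
        self._context_factory = context_factory
        self._stack = contextlib.ExitStack()
        self._resource = None

    def setup(self):
        # Enter the context manager once, as setup would.
        self._resource = self._stack.enter_context(self._context_factory())

    def process(self, element):
        # Every call receives the same entered resource.
        return self._fn(element, self._resource)

    def teardown(self):
        # Exit the context manager, as teardown would.
        self._stack.close()


worker = FakeWorker(lambda x, conn: f"{x} via {conn}", open_connection)
worker.setup()
results = [worker.process(x) for x in ("a", "b")]
worker.teardown()
print(results, events)
```

A per-bundle variant would follow the same pattern, entering the context manager at the start of each bundle and exiting it when the bundle finishes.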
Go SDK Prism Runner
Pre-built Prism binaries are now part of the release and are available via the GitHub release page (#29697).
Some Java and Python pipelines will work with these binaries, but shipping them is in part preparation for real runner wrappers in 2.58.0.
ProcessingTime is now handled synthetically in both TestStream and non-TestStream pipelines, so test pipelines execute quickly by default (#30083).
Prism does NOT yet support "real time" execution in this release.
Improved processing of large elements to reduce the chance of exceeding the 2GB protobuf limit (Python) (#31607).
Breaking Changes
Java's View.asList() side inputs are now optimized for iterating rather than
indexing when in the global window.
This new implementation still supports all (immutable) List methods as before,
but some of the random access methods like get() and size() will be slower.
To use the old implementation one can use View.asList().withRandomAccess().
SchemaTransforms implemented with TypedSchemaTransformProvider now produce a
configuration Schema that uses the snake_case naming convention (#31374).
This makes the following cases problematic:
Running a pre-2.57.0 remote SDK pipeline containing a 2.57.0+ Java SchemaTransform,
and vice versa:
Running a 2.57.0+ remote SDK pipeline containing a pre-2.57.0 Java SchemaTransform.
All direct uses of Python's SchemaAwareExternalTransform
should be updated to use the new snake_case parameter names.
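When updating direct uses of `SchemaAwareExternalTransform`, existing camelCase keyword names need snake_case equivalents. The helper below is a hypothetical migration aid, not Beam's own conversion logic, and the example parameter names are made up. Check the result against the transform's actual configuration schema.

```python
import re


def to_snake_case(name):
    """Convert a camelCase identifier to snake_case."""
    # Split before a capitalized word ("numShards" -> "num_Shards").
    partial = re.sub(r"(.)([A-Z][a-z]+)", r"\1_\2", name)
    # Split between a lowercase letter or digit and an uppercase letter,
    # then lowercase everything ("write_ToBQ" -> "write_to_bq").
    return re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", partial).lower()


def rename_kwargs(kwargs):
    """Return a copy of kwargs with every key converted to snake_case."""
    return {to_snake_case(key): value for key, value in kwargs.items()}


# Hypothetical pre-2.57.0 configuration keys.
old_kwargs = {"outputPrefix": "gs://my-bucket/out", "numShards": 4}
print(rename_kwargs(old_kwargs))
```

Names that are already snake_case pass through unchanged, so the helper can be applied to a mix of old and new call sites.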
Upgraded Jackson Databind to 2.15.4 (Java) (#26743).
Jackson 2.15 has known breaking changes; an important one is that it imposes a buffer limit on the parser.
If your custom PTransform or DoFn is affected, refer to #31580 for mitigation.
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot show ignore conditions` will show all of the ignore conditions of the specified dependency
- `@dependabot ignore major version` will close this group update PR and stop Dependabot creating any more for the specific dependency's major version (unless you unignore this specific dependency's major version or upgrade to it yourself)
- `@dependabot ignore minor version` will close this group update PR and stop Dependabot creating any more for the specific dependency's minor version (unless you unignore this specific dependency's minor version or upgrade to it yourself)
- `@dependabot ignore <dependency name>` will close this group update PR and stop Dependabot creating any more for the specific dependency (unless you unignore this specific dependency or upgrade to it yourself)
- `@dependabot unignore <dependency name>` will remove all of the ignore conditions of the specified dependency
- `@dependabot unignore <dependency name> <ignore condition>` will remove the ignore condition of the specified dependency and ignore conditions
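Bumps the go-deps group with 5 updates:

| Package | From | To |
| --- | --- | --- |
| github.com/apache/beam/sdks/v2 | 2.56.0 | 2.57.0 |
| google.golang.org/api | 0.185.0 | 0.187.0 |
| google.golang.org/genproto | 0.0.0-20240617180043-68d350f18fd4 | 0.0.0-20240624140628-dc46fd24d27d |
| google.golang.org/genproto/googleapis/api | 0.0.0-20240610135401-a8a62080eff3 | 0.0.0-20240617180043-68d350f18fd4 |
| google.golang.org/genproto/googleapis/rpc | 0.0.0-20240617180043-68d350f18fd4 | 0.0.0-20240624140628-dc46fd24d27d |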
Updates `github.com/apache/beam/sdks/v2` from 2.56.0 to 2.57.0

Release notes
Sourced from github.com/apache/beam/sdks/v2's releases.
... (truncated)

Changelog
Sourced from github.com/apache/beam/sdks/v2's changelog.

Commits
- `e3314d4` Set version for 2.57.0 RC1
- `96766f2` Merge pull request #31626: [release-2.57.0] Cherrypick #31490 into the releas...
- `40653e7` Merge pull request #31642: Use correct name for tox doc task.
- `4b132b2` Use correct name for tox doc task.
- `2ff8b61` Plumb Redistribute "allow duplicates" property to Dataflow (#31490)
- `df480af` Merge pull request #31562: Cherrypick #31550 and #31602 onto release branch
- `f1da72d` Fix internal test failure caused by PR 31550 (#31602)
- `6df1214` Merge pull request #31589: [Release-2.57.0] Cherry-pick #31580 into release b...
- `d64f8ca` Merge pull request #31598: [release-2.57.0] Cherrypick #31581 into the releas...
- `d7da9d3` Use a type-compliant sentinel.

Updates `google.golang.org/api` from 0.185.0 to 0.187.0

Release notes
Sourced from google.golang.org/api's releases.

Changelog
Sourced from google.golang.org/api's changelog.

Commits
- `b6c87f6` chore(main): release 0.187.0 (#2656)
- `e051997` fix: pass through gRPC api key option to new auth lib (#2664)
- `2ea4e07` chore(all): update all to dc46fd2 (#2662)
- `6e061ce` feat(all): auto-regenerate discovery clients (#2663)
- `0a238f5` feat(all): auto-regenerate discovery clients (#2661)
- `3ca2f84` feat(all): auto-regenerate discovery clients (#2660)
- `7cd88da` feat(all): auto-regenerate discovery clients (#2659)
- `a758bc1` fix(gensupport): wrap chunk upload err for retries (#2657)
- `719f988` feat(all): auto-regenerate discovery clients (#2658)
- `1a28e06` feat(all): auto-regenerate discovery clients (#2655)

Updates `google.golang.org/genproto` from 0.0.0-20240617180043-68d350f18fd4 to 0.0.0-20240624140628-dc46fd24d27d
Commits

Updates `google.golang.org/genproto/googleapis/api` from 0.0.0-20240610135401-a8a62080eff3 to 0.0.0-20240617180043-68d350f18fd4
Commits

Updates `google.golang.org/genproto/googleapis/rpc` from 0.0.0-20240617180043-68d350f18fd4 to 0.0.0-20240624140628-dc46fd24d27d
Commits