A set of tools for PDF to XML conversion, utilising Apache Beam and other tools. The aim of this project is to bring multiple tools together to generate a full XML document. It is now mainly used to evaluate external tools.
We are happy to present the new 2.36.0 release of Apache Beam.
This release includes both improvements and new functionality.
See the download page for this release.
Support for stopReadTime on KafkaIO SDF (Java) (BEAM-13171).
Added ability to register URI schemes to use the S3 protocol via FileIO using amazon-web-services2 (amazon-web-services already had this ability). (BEAM-12435, BEAM-13245).
New Features / Improvements
Added support for cloudpickle as a pickling library for Python SDK (BEAM-8123). To use cloudpickle, set the pipeline option `--pickler_lib=cloudpickle`.
Added option to specify triggering frequency when streaming to BigQuery (Python) (BEAM-12865).
Added option to enable caching uploaded artifacts across job runs for Python Dataflow jobs (BEAM-13459). To enable, set the pipeline option `--enable_artifact_caching`; this will be enabled by default in a future release.
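The two opt-in pipeline options above are ordinary command-line style flags. The following is a minimal sketch, assuming a hypothetical helper named `beam_2_36_flags` (it is not part of Beam) that only assembles the argument list. With apache-beam 2.36.0 installed, the resulting list could be handed to `PipelineOptions`, as the trailing comment suggests.

```python
from typing import List


def beam_2_36_flags(use_cloudpickle: bool = True,
                    cache_artifacts: bool = True) -> List[str]:
    """Assemble the opt-in flags named in the 2.36.0 release notes.

    Hypothetical helper for illustration only; it is not part of Beam.
    """
    flags: List[str] = []
    if use_cloudpickle:
        # Selects cloudpickle as the Python SDK pickling library (BEAM-8123).
        flags.append("--pickler_lib=cloudpickle")
    if cache_artifacts:
        # Opts in to artifact caching for Python Dataflow jobs (BEAM-13459).
        flags.append("--enable_artifact_caching")
    return flags


if __name__ == "__main__":
    print(beam_2_36_flags())
    # With apache-beam installed, the list can be passed on, for example:
    #   from apache_beam.options.pipeline_options import PipelineOptions
    #   options = PipelineOptions(beam_2_36_flags())
```

Keeping the flags in one place makes it easy to drop `--enable_artifact_caching` once it becomes the default in a later release.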
Breaking Changes
Updated jedis from 3.x to 4.x in Java RedisIO. If you use RedisIO and use jedis directly, please refer to this page to update it (BEAM-12092).
Datatype of timestamp fields in SqsMessage for AWS IOs for SDK v2 was changed from String to long, and the visibility of all fields was changed from package-private to public (BEAM-13638).
Bugfixes
Properly check output timestamps on elements output from DoFns, timers, and onWindowExpiration in Java (BEAM-12931).
Fixed a bug with DeferredDataFrame.xs when used with a non-tuple key (BEAM-13421).
Known Issues
Users may encounter an unexpected java.lang.ArithmeticException when outputting a timestamp for an element further than allowedSkew from an allowed DoFn skew set to a value more than Integer.MAX_VALUE.
Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.
Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Bumps apache-beam[gcp] from 2.35.0 to 2.36.0.
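After merging a bump like this one, it can be useful to confirm that the environment actually picked up the new release. The sketch below uses the standard library's `importlib.metadata` to do that. The helper names `parse_release` and `installed_beam_version` are hypothetical, and the parser assumes a plain numeric `X.Y.Z` version string (pre-release suffixes such as `rc1` are not handled).

```python
from importlib import metadata
from typing import Optional, Tuple


def parse_release(version: str) -> Tuple[int, ...]:
    """Turn a plain 'X.Y.Z' release string into a comparable tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def installed_beam_version() -> Optional[str]:
    """Return the installed apache-beam version, or None if it is absent."""
    try:
        return metadata.version("apache-beam")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_beam_version()
    if found is None:
        print("apache-beam is not installed")
    else:
        # Compare as integer tuples so that 2.9.0 sorts before 2.36.0.
        print(found, parse_release(found) >= (2, 36, 0))
```

Comparing integer tuples rather than raw strings avoids the lexicographic trap where `"2.9.0"` would sort after `"2.36.0"`.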
Release notes
Sourced from apache-beam[gcp]'s releases.
... (truncated)
Changelog
Sourced from apache-beam[gcp]'s changelog.
Commits
- `81f3a16` Set version for 2.36.0 RC3
- `3819f2a` [release-2.36.0][BEAM-13430] Fix provided configuration (#16704)
- `36a5b0f` [release-2.36.0] Move xz licenses to manual licenses for Java containers (#16...
- `3ccc462` [release-2.36.0][BEAM-13430] Revert Spark libraries in spark runner to provid...
- `690926d` [release-2.36.0][BEAM-13781] Exclude grpc-netty-shaded from gax-grpc's depend...
- `ccc0741` [BEAM-11648] Fix exception with BigQuery StreamWriter TraceID. (#16616)
- `a1f3f86` Merge pull request #16585 from emilymye/release-2.36.0
- `41f775d` reset version to 2.36.0
- `c685d89` Merge commit 'e3c24f0e594d15121ea309806cc56306276d8e0a' into release-2.36.0
- `e3c24f0` [BEAM-13430] Re-add provided configuration (#16552)