Open sduck opened 7 years ago
@sduck, thanks for your report. Indeed, this is an issue we are looking at.
@sammcveety, can you perhaps comment more?
@sduck this is a limitation in the current SDK, documented at https://cloud.google.com/dataflow/docs/templates/creating-templates#pipeline-io-and-runtime-parameters. We are working to remove this restriction in future releases.
Thanks for the updates - looking forward to a solution on this restriction.
Hi @sammcveety how far off would a <2.0 release be?
If you mean >=2.0, there is already a 2.0beta2 out. https://github.com/apache/beam/pull/2123 addresses BQ.Read.
@sammcveety If we are talking about 2.0, when would a GA release be expected (for support, business approval, etc.)? And will there be a backport of the bug fix to the Dataflow 1.9.x SDK?
I believe a tentative date of Q2 was announced at Next17. There are no plans for a backport to 1.9.
I'm having this exact problem @sammcveety with beam-sdks-java-io-google-cloud-platform 0.6.0. Can't wait for the solution ;)
https://github.com/apache/beam/pull/2123 in progress
Just wanted to add that the Cloud Console will list such a repeated batch job as successful, despite it producing no output (for the BQ load step, in any case). Although I should have read the documentation more carefully (mea culpa), it had me confused for some time (until I finally landed here). Hope we'll see 2.0 soon; I'll work around it with TextIO and a separate load in the meantime, I suppose.
It would be a huge pity not to backport to 1.9.x, since the templating feature exists and is pretty much paralyzed for BQ, where it's most useful. What's the logic behind not doing so? Is it more work than everyone porting to >2.0?
There are no plans to make further enhancements to 1.9. Upgrading to 2.0 should be relatively easy; please let us know if you encounter issues.
It seems this problem still exists with v2.0.0. The documentation for SDK 2.x says:
"For BigQuery batch pipelines, templates can only be executed once, as the BigQuery job ID is set at template creation time. This restriction will be removed in a future release."
What's the point of supporting templates if they can only be executed once? It would be better to say that templating is not supported. I wasted several hours because I missed that fine print.
Is there any time estimate for implementing this yet?
It's fixed in Beam SDK 2.3.0; see https://cloud.google.com/dataflow/docs/templates/creating-templates#pipeline-io-and-runtime-parameters
I am having the same issue with Beam version 2.25.0.
I'm writing a simple batch job that I'm implementing as a template. It is supposed to read data from BigQuery. Everything works fine on the first run, but all subsequent executions of the template result in an error from the BigQuery service: "Request failed with code 409, will NOT retry: https://www.googleapis.com/bigquery/v2/projects/boozt-ga/jobs"
I can see that all executions end up giving the BigQuery extract job the exact same job ID, and that seems to be the reason BigQuery fails.
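For anyone trying to picture the failure mode described above, here is a minimal toy model in plain Java. It is only a sketch and not the SDK's actual code: `FakeBigQuery`, `Template`, and the `extract-` job ID prefix are hypothetical names made up for illustration, and the only assumption taken from the thread is that the service rejects a reused job ID with a 409. It contrasts a job ID frozen at template creation time with one generated on each execution.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import java.util.function.Supplier;

/** Toy stand-in for the BigQuery jobs endpoint (hypothetical): a job ID may be used only once. */
class FakeBigQuery {
    private final Set<String> seenJobIds = new HashSet<>();

    /** Returns 200 for a new job ID and 409 (conflict) for a repeated one. */
    int insertJob(String jobId) {
        return seenJobIds.add(jobId) ? 200 : 409;
    }
}

/** Hypothetical "template": it carries whatever was decided when it was created. */
class Template {
    private final Supplier<String> jobIdSource;

    private Template(Supplier<String> jobIdSource) {
        this.jobIdSource = jobIdSource;
    }

    /** Job ID chosen once, at template creation time (the behavior reported in this thread). */
    static Template withIdFixedAtCreation() {
        String frozen = "extract-" + UUID.randomUUID();
        return new Template(() -> frozen);
    }

    /** Job ID generated anew on every execution. */
    static Template withIdPerExecution() {
        return new Template(() -> "extract-" + UUID.randomUUID());
    }

    /** Simulates one run of the template by submitting a single job. */
    int execute(FakeBigQuery bq) {
        return bq.insertJob(jobIdSource.get());
    }
}

class TemplateJobIdDemo {
    public static void main(String[] args) {
        FakeBigQuery bq = new FakeBigQuery();

        Template fixed = Template.withIdFixedAtCreation();
        System.out.println("fixed ID, run 1: " + fixed.execute(bq));
        System.out.println("fixed ID, run 2: " + fixed.execute(bq));

        Template fresh = Template.withIdPerExecution();
        System.out.println("fresh ID, run 1: " + fresh.execute(bq));
        System.out.println("fresh ID, run 2: " + fresh.execute(bq));
    }
}
```

In this model the first run of the fixed-ID template succeeds and the second is rejected as a conflict, while the per-execution variant succeeds both times. That matches the symptom reported here (first run fine, later runs failing with 409), but it says nothing about how the real SDK generates or should generate its job IDs.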