Open damnMeddlingKid opened 4 months ago
https://beam.apache.org/documentation/io/built-in/google-bigquery/#writing-to-a-table
Internal work on Runner V2 is under way and should resolve this issue soon.
For now, you can disable Runner V2 if possible.
cc @scwhittle
There is also a public GCP issue tracking this.
Beyond using v1, another mitigation allowing the use of .withMethod(STORAGE_WRITE_API) with the v2 runner is to both:
You can refer to this blog post for guidance on setting the number of streams if you are disabling autosharding.
What happened?
We are attempting to use the STORAGE_WRITE_API with exactly-once guarantees in our pipelines running on Runner V2. Our configuration uses dynamic destinations and auto sharding, as detailed below:
Issue Encountered
When we run our pipeline on Runner V2 with the above BigQueryIO configuration, we get the following error:
The pipeline executes successfully when we modify the configuration to use a static number of write streams (withNumStorageWriteApiStreams(40)) instead of auto sharding.
While looking for references on this issue, I found https://partnerissuetracker.corp.google.com/issues/271105510, which claims that auto sharding should work on Runner V2.
Questions
STORAGE_WRITE_API on Runner V2
STORAGE_WRITE_API support on Runner V2? I'm struggling to find an issue or documentation on this.
Issue Priority
Priority: 2 (default / most bugs should be filed as P2)
Issue Components