helensilva14 opened this issue 2 years ago
@reuvenlax I'm assigning this to you. Please do triage / acknowledge whether this makes sense.
This looks like you might have incompatible JARs linked into your binary.
Hi! Coming back to this issue later than expected. Do you mean if I run the pipeline with latest version now (2.40 instead of 2.39) I could get a different result? I'll try and reach out with the results.
It means that your build is probably pulling in a different version of the BigQuery library than the one used by Beam.
The usual advice in these cases is to inspect the dependency tree:

`mvn dependency:tree | grep bigquery`

This lists the BigQuery-related libraries in your project. If you find the same library pulled in at two different versions, you can resolve the conflict with a `dependencyManagement` section, usually by pinning the later version.

You can also use `-Dincludes=` to filter more precisely than with grep: https://maven.apache.org/plugins/maven-dependency-plugin/examples/filtering-the-dependency-tree.html
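As a sketch, a `dependencyManagement` entry pinning the BigQuery client library could look like the following (the coordinates and version are illustrative; use whatever artifact and version your own dependency tree shows as conflicting):

```xml
<dependencyManagement>
  <dependencies>
    <!-- Illustrative: pin the version the rest of your tree expects -->
    <dependency>
      <groupId>com.google.cloud</groupId>
      <artifactId>google-cloud-bigquery</artifactId>
      <version>2.x.y</version>
    </dependency>
  </dependencies>
</dependencyManagement>
```

For filtering without grep, something like `mvn dependency:tree -Dincludes=*:*bigquery*` matches any artifactId containing "bigquery" regardless of groupId (see the plugin docs linked above for the full pattern syntax).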
What happened?
Hello! My team and I found a scenario where we needed to check whether Beam can handle dynamic creation of partitioned BQ tables using the new API.
Like the Spark BigQuery Connector, the Beam connector supports different ways of writing to BigQuery; these write methods are currently available:
The second and third ones make use of the BigQuery Storage Write API (in the Spark BQ connector this would be the "direct" write mode).
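As a minimal sketch of the scenario under test (assuming a `PCollection<TableRow>` named `rows`; the project, dataset, table, and column names are illustrative), selecting the Storage Write API and asking Beam to create a table partitioned on a custom column might look like:

```java
import com.google.api.services.bigquery.model.TableFieldSchema;
import com.google.api.services.bigquery.model.TableSchema;
import com.google.api.services.bigquery.model.TimePartitioning;
import java.util.Arrays;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.CreateDisposition;
import org.apache.beam.sdk.io.gcp.bigquery.BigQueryIO.Write.Method;

// ...
rows.apply("WriteToBQ",
    BigQueryIO.writeTableRows()
        .to("my-project:my_dataset.my_table")            // illustrative table spec
        .withMethod(Method.STORAGE_WRITE_API)            // use the Storage Write API
        .withCreateDisposition(CreateDisposition.CREATE_IF_NEEDED)
        .withSchema(new TableSchema().setFields(Arrays.asList(
            new TableFieldSchema().setName("event_ts").setType("TIMESTAMP"),
            new TableFieldSchema().setName("payload").setType("STRING"))))
        // ask Beam to create the table partitioned by day on a custom TIMESTAMP column
        .withTimePartitioning(new TimePartitioning().setType("DAY").setField("event_ts")));
```

This mirrors the shape of the Gist pipeline linked below; running it requires a GCP project with BigQuery enabled, so it is not directly runnable in isolation.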
Regarding table partitions (default BQ time partitioning):
Regarding table partitions (custom columns):
Conclusion: it seems that the Apache Beam connector implementation that uses the BigQuery Storage Write API has problems and limitations around table partitioning.
The testing pipeline is provided as a Gist here. We hope this issue can be addressed and would be glad to help/validate. Thanks!
Issue Priority
Priority: 1
Issue Component
Component: io-java-gcp