GoogleCloudPlatform / DataflowTemplates

Cloud Dataflow Google-provided templates for solving in-Cloud data tasks
https://cloud.google.com/dataflow/docs/guides/templates/provided-templates
Apache License 2.0

[KafkaToBigQueryFlex Template]: Add support for Avro ENUM type and fix FLOAT type. #2029

Open an2x opened 4 days ago

an2x commented 4 days ago

Using Avro format with an ENUM type field in the KafkaToBigQueryFlex template currently causes the following error:

Expected Avro schema types [STRING] for BigQuery STRING field operation, but received ENUM

Avro FLOAT type also wasn't handled correctly, resulting in the following error:

Expected Avro schema types [DOUBLE, INT] for BigQuery FLOAT64 field <...>, but received FLOAT

This PR should fix both.
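
For illustration, the kind of type check behind both errors can be sketched as follows. This is a minimal, self-contained sketch and not the actual `BigQueryAvroUtils` code: the class `AvroTypeCheckSketch`, the simplified `AvroType` enum (standing in for Avro's `Schema.Type`), and the methods `isAccepted` and `convert` are all hypothetical names. It assumes the fix amounts to accepting ENUM for BigQuery STRING fields and FLOAT for FLOAT64 fields.

```java
import java.util.EnumSet;
import java.util.Map;
import java.util.Set;

/** Illustrative sketch only; hypothetical names, not the template's BigQueryAvroUtils. */
class AvroTypeCheckSketch {

  /** Simplified stand-in for Avro's Schema.Type, limited to the types discussed here. */
  enum AvroType { STRING, ENUM, INT, FLOAT, DOUBLE }

  /** Avro types accepted per BigQuery field type (assumption: ENUM and FLOAT are the additions). */
  static final Map<String, Set<AvroType>> ACCEPTED = Map.of(
      "STRING", EnumSet.of(AvroType.STRING, AvroType.ENUM),
      "FLOAT64", EnumSet.of(AvroType.DOUBLE, AvroType.FLOAT, AvroType.INT));

  /** Returns true if a value of the given Avro type may be written to the BigQuery field type. */
  static boolean isAccepted(String bqType, AvroType avroType) {
    return ACCEPTED.getOrDefault(bqType, Set.of()).contains(avroType);
  }

  /** Converts a value to the Java type used for the BigQuery field, or throws on a mismatch. */
  static Object convert(String bqType, AvroType avroType, Object value) {
    if (!isAccepted(bqType, avroType)) {
      throw new IllegalArgumentException(String.format(
          "Expected Avro schema types %s for BigQuery %s field, but received %s",
          ACCEPTED.getOrDefault(bqType, Set.of()), bqType, avroType));
    }
    switch (bqType) {
      case "STRING":
        // An enum symbol is written as its name; a plain String stands in for it here.
        return value.toString();
      case "FLOAT64":
        // Widen Float or Integer to Double so FLOAT values are no longer rejected.
        return ((Number) value).doubleValue();
      default:
        throw new IllegalStateException("Unhandled BigQuery type: " + bqType);
    }
  }
}
```

With this sketch, `convert("STRING", AvroType.ENUM, "ACTIVE")` yields the symbol name as a string and `convert("FLOAT64", AvroType.FLOAT, 1.5f)` yields a `Double`, while an unsupported combination such as `convert("STRING", AvroType.INT, 1)` still fails with a message in the same style as the errors quoted above.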

codecov[bot] commented 4 days ago

Codecov Report

Attention: Patch coverage is 0% with 12 lines in your changes missing coverage. Please review.

Project coverage is 45.40%. Comparing base (b04de34) to head (23ce31e). Report is 2 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| ...gle/cloud/teleport/v2/utils/BigQueryAvroUtils.java | 0.00% | 12 Missing :warning: |

Additional details and impacted files

```diff
@@             Coverage Diff              @@
##               main    #2029      +/-   ##
============================================
- Coverage     45.42%   45.40%   -0.03%
- Complexity     3678     4003     +325
============================================
  Files           842      843       +1
  Lines         49970    49999      +29
  Branches       5261     5264       +3
============================================
+ Hits          22697    22700       +3
- Misses        25605    25628      +23
- Partials       1668     1671       +3
```

| [Components](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2029/components?src=pr&el=components&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | Coverage Δ | |
|---|---|---|
| [spanner-templates](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2029/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `66.71% <ø> (-0.02%)` | :arrow_down: |
| [spanner-import-export](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2029/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `64.16% <ø> (-0.07%)` | :arrow_down: |
| [spanner-live-forward-migration](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2029/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `75.88% <ø> (ø)` | |
| [spanner-live-reverse-replication](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2029/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `76.65% <ø> (ø)` | |
| [spanner-bulk-migration](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2029/components?src=pr&el=component&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | `86.37% <ø> (ø)` | |

| [Files with missing lines](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2029?dropdown=coverage&src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform) | Coverage Δ | |
|---|---|---|
| [...gle/cloud/teleport/v2/utils/BigQueryAvroUtils.java](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2029?src=pr&el=tree&filepath=v2%2Fkafka-to-bigquery%2Fsrc%2Fmain%2Fjava%2Fcom%2Fgoogle%2Fcloud%2Fteleport%2Fv2%2Futils%2FBigQueryAvroUtils.java&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform#diff-djIva2Fma2EtdG8tYmlncXVlcnkvc3JjL21haW4vamF2YS9jb20vZ29vZ2xlL2Nsb3VkL3RlbGVwb3J0L3YyL3V0aWxzL0JpZ1F1ZXJ5QXZyb1V0aWxzLmphdmE=) | `0.00% <0.00%> (ø)` | |

... and [6 files with indirect coverage changes](https://app.codecov.io/gh/GoogleCloudPlatform/DataflowTemplates/pull/2029/indirect-changes?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=GoogleCloudPlatform)