apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

[java] BQ: use logical type in avro schema factory on write #33163

Closed RustedBones closed 15 hours ago

RustedBones commented 1 week ago

When writing to BigQuery with Avro, if the table schema contains DATE, TIME, or TIMESTAMP columns, the default schema factory should create Avro fields with the matching logical type.
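To illustrate the intended mapping, here is a minimal sketch using only the JDK. The class `BqAvroLogicalTypeSketch` and its method `avroTypeJson` are hypothetical names made up for this example and are not Beam's `BigQueryAvroUtils`. The choice of microsecond precision for TIME and TIMESTAMP is an assumption; the base types (`int` for `date`, `long` for `time-micros` and `timestamp-micros`) follow the Avro specification.

```java
import java.util.Map;

// Hypothetical helper for illustration only, not Beam's implementation.
// Maps a BigQuery column type to the JSON of an Avro field type that
// carries the matching logical type.
final class BqAvroLogicalTypeSketch {

  // Assumption: microsecond precision is chosen for TIME and TIMESTAMP.
  private static final Map<String, String> WRITE_TYPES = Map.of(
      "DATE", "{\"type\":\"int\",\"logicalType\":\"date\"}",
      "TIME", "{\"type\":\"long\",\"logicalType\":\"time-micros\"}",
      "TIMESTAMP", "{\"type\":\"long\",\"logicalType\":\"timestamp-micros\"}");

  // Returns the Avro type JSON for a BigQuery type, or fails loudly
  // for types this sketch does not cover (for example DATETIME).
  static String avroTypeJson(String bqType) {
    String json = WRITE_TYPES.get(bqType);
    if (json == null) {
      throw new IllegalArgumentException("Unmapped BigQuery type: " + bqType);
    }
    return json;
  }

  public static void main(String[] args) {
    for (String t : new String[] {"DATE", "TIME", "TIMESTAMP"}) {
      System.out.println(t + " -> " + avroTypeJson(t));
    }
  }
}
```

A real schema factory would build `org.apache.avro.Schema` objects instead of JSON strings; plain strings are used here only to keep the sketch dependency-free.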

There is still an issue with DATETIME: BigQueryAvroUtils::toGenericAvroSchema favors the read side and generates an Avro field of type string(datetime). That field can't be used on write, which expects long(local-timestamp-millis) or long(local-timestamp-micros).

See the note in the doc:

When exporting to Avro from BigQuery, DATETIME is exported as a STRING with a custom logical type that is not recognized as a DATETIME upon importing back into BigQuery.
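For reference, the write side's long(local-timestamp-micros) value can be sketched with the JDK alone. Per the Avro specification, it is the number of microseconds from 1970-01-01T00:00:00.000000 with no time zone applied. The class and method names below are hypothetical and are not part of this PR or of Beam.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical helper for illustration only: converts between
// java.time.LocalDateTime and an Avro local-timestamp-micros long.
final class LocalTimestampMicrosSketch {

  // Zone-less epoch that local-timestamp values are counted from.
  private static final LocalDateTime EPOCH = LocalDateTime.of(1970, 1, 1, 0, 0);

  // Microseconds between the zone-less epoch and the given value;
  // negative for values before 1970. Sub-microsecond digits are truncated.
  static long toLocalTimestampMicros(LocalDateTime value) {
    return ChronoUnit.MICROS.between(EPOCH, value);
  }

  // Inverse conversion, from microseconds back to a LocalDateTime.
  static LocalDateTime fromLocalTimestampMicros(long micros) {
    return EPOCH.plus(micros, ChronoUnit.MICROS);
  }
}
```

No time zone conversion is involved in either direction, which is what distinguishes this from the timestamp-micros encoding used for TIMESTAMP.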

github-actions[bot] commented 1 week ago

Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment "assign set of reviewers".

github-actions[bot] commented 1 week ago

Assigning reviewers. If you would like to opt out of this review, comment "assign to next reviewer":

R: @Abacn for label java. R: @damondouglas for label io.

Available commands:

The PR bot will only process comments in the main thread (not review comments).

github-actions[bot] commented 1 day ago

Reminder, please take a look at this PR: @Abacn @damondouglas