apache / beam

Apache Beam is a unified programming model for Batch and Streaming data processing.
https://beam.apache.org/
Apache License 2.0

AvroUtils incorrectly converts LogicalType timestamps from long into Joda DateTimes #20205

Open damccorm opened 2 years ago

damccorm commented 2 years ago

Copied from the mailing list report:

I think the method AvroUtils.toBeamSchema has an unexpected side effect. I found that if you invoke it and then run a pipeline of GenericRecords containing a timestamp (I tried with the logical type timestamp-millis), Beam converts that timestamp from long to org.joda.time.DateTime, even if you don't apply any transformation to the pipeline. Do you think it's a bug?

More details on how to reproduce here:

https://lists.apache.org/thread.html/r43fb2896e496b7493a962207eb3b95360abc30b9d091b26f110264d0%40%3Cuser.beam.apache.org%3E

Imported from Jira BEAM-9863. Original Jira may contain additional context. Reported by: iemejia.

slouc commented 2 years ago

I have the same problem, although I found the culprit to be this conversion in the AvroUtils static block rather than AvroUtils.toBeamSchema itself.

Seems like the same thing though. Here's a small repo that contains more info and demonstrates my problem: https://github.com/slouc/avro-beam-test.

I have also shown how it can cause runtime exceptions when building the Avro record.
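
For readers following along, here is a minimal sketch of the mechanism being described: a static initializer that registers a conversion on a process-wide singleton, so that merely touching the utility class changes how timestamps are read everywhere. This is a toy model using only the JDK, not Beam or Avro code. The names GlobalModel, SchemaUtils, addConversion, read and toSchema are all hypothetical stand-ins, and java.time.Instant stands in for org.joda.time.DateTime.

```java
import java.time.Instant;
import java.util.HashMap;
import java.util.Map;
import java.util.function.LongFunction;

public class StaticBlockSideEffect {

    // Hypothetical stand-in for a process-wide data model singleton.
    static final class GlobalModel {
        private static final GlobalModel INSTANCE = new GlobalModel();
        private final Map<String, LongFunction<Object>> conversions = new HashMap<>();

        static GlobalModel get() {
            return INSTANCE;
        }

        void addConversion(String logicalType, LongFunction<Object> fn) {
            conversions.put(logicalType, fn);
        }

        // Returns the raw long unless a conversion is registered for the logical type.
        Object read(String logicalType, long raw) {
            LongFunction<Object> fn = conversions.get(logicalType);
            return fn == null ? (Object) raw : fn.apply(raw);
        }
    }

    // Hypothetical stand-in for a utility class with a static block.
    static final class SchemaUtils {
        static {
            // Runs once, the first time the class is initialized, and mutates global state.
            GlobalModel.get().addConversion("timestamp-millis", Instant::ofEpochMilli);
        }

        // The method itself has nothing to do with reading records.
        static String toSchema(String avroSchema) {
            return "beam:" + avroSchema;
        }
    }

    public static void main(String[] args) {
        Object before = GlobalModel.get().read("timestamp-millis", 1_000L);
        System.out.println(before.getClass().getSimpleName()); // Long

        SchemaUtils.toSchema("record"); // triggers the static block as a side effect

        Object after = GlobalModel.get().read("timestamp-millis", 1_000L);
        System.out.println(after.getClass().getSimpleName()); // Instant
    }
}
```

In this sketch the pipeline code never asks for a conversion. The change in the returned type comes entirely from class initialization, which is why code that expects a long can fail at runtime after the utility class has been loaded.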

RustedBones commented 2 years ago

As stated in #16271:

BEAM-9144 is a 'snowflake'. All other Joda logical time conversions are still failing (date, time-millis, time-micros, timestamp-micros). This should not be in the framework but in user code.

This should definitely be moved out of the framework, since it creates a side effect simply by having the class in scope.