Open rcaudy opened 6 months ago
@lbooker42 We need to do two things:

- Fix `aj` to ensure that we only reinterpret either side if we can reinterpret both sides.
- [...] `ZonedDateTime` in `naturalJoin` and `join` IF we can reinterpret both sides.
- [...] `ZonedDateTime` in `whereIn`, unless we can refactor to know the set table and the source table at the same time.

https://github.com/deephaven/deephaven-core/pull/5780#discussion_r1684961442 We may need to revisit column stats for ZDTs.

Firstly, the over-arching issues:

1. `ZonedDateTime` instances are not equal to other `ZonedDateTime` instances with different time zones, and so when we do convert to a `long` for hashing or sorting we are changing the definition of equals that applies.
2. `ReinterpretUtils.maybeConvertToPrimitive` rightfully requires support for symmetric conversion from the `ColumnSource` to be reinterpreted (i.e. `source instanceof ConvertibleTimeSource.Zoned`) and thus a single time zone for the entire `ColumnSource`.

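To make the first point concrete, here is a minimal plain-JDK sketch (no Deephaven classes involved). It assumes the `long` representation is nanoseconds since the epoch; `ZdtEqualityDemo` and `epochNanos` are illustrative names, not engine code.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;

public class ZdtEqualityDemo {
    /** Epoch nanoseconds for a ZonedDateTime; the zone is discarded (assumed long encoding). */
    public static long epochNanos(ZonedDateTime zdt) {
        Instant i = zdt.toInstant();
        return Math.addExact(Math.multiplyExact(i.getEpochSecond(), 1_000_000_000L), i.getNano());
    }

    public static void main(String[] args) {
        Instant instant = Instant.parse("2024-07-19T12:00:00Z");
        ZonedDateTime ny = instant.atZone(ZoneId.of("America/New_York"));
        ZonedDateTime tokyo = instant.atZone(ZoneId.of("Asia/Tokyo"));

        // Java's equals() includes the zone, so these differ.
        System.out.println(ny.equals(tokyo)); // false
        // compareTo() is consistent with equals(), so it is non-zero as well.
        System.out.println(ny.compareTo(tokyo) == 0); // false
        // Once converted to a long, the zone no longer participates.
        System.out.println(epochNanos(ny) == epochNanos(tokyo)); // true
    }
}
```

Hashing or sorting on the `long` therefore treats these two values as the same key, while `ZonedDateTime.equals()` does not.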
Secondly, the consistency issues:

1. `naturalJoin` and `join` are never reinterpreting `ZonedDateTime` key sources to `long` for hashing. Hence they are always applying `ZonedDateTime.hashCode()` and `ZonedDateTime.equals()`. This is internally consistent within the operations, at least, but inconsistent with ~other operations~ `aj`.
2. `aj` is using `ReinterpretUtils.maybeConvertToPrimitive`, and so we might reinterpret only one side, resulting in an error that seems inexplicable to users. We will also be getting equality "wrong" (by the Java definition) if the two sides have different time zones.
3. `aggBy` converts symmetrically, which is good. ~However, it is effectively picking different definitions of equality depending on the provenance of the `ZonedDateTime` key source.~ With current `maybeConvertToPrimitive`, we're always correct.
4. `sort` has basically the same issue as `aggBy`. ~While we're always using a comparison that is consistent with equality, which definition of comparison and equality we use depends on the source.~ With current `maybeConvertToPrimitive`, we're always correct.

Solutions:

1. Remove `ZonedDateTime` support from `maybeConvertToPrimitive`. This standardizes on Java's definition of comparison and equality, but eliminates opportunities for optimization.
2. [...] `ConvertibleTimeSource.Zoned.getZone()` results, we need to error out or "pick a winner". Worse, if we have to convert back from an un-zoned `long` source, we need to pick a zone.
   - [...] `DateTimeUtils.timeZone()`, e.g. the system default?
3. Keep `maybeConvertToPrimitive` the same. For joins, we should only convert if both sides are reinterpretable and have the same fixed zone.

~I think we should pick option (2), as that renders it less fraught to reinterpret between `Instant` and `ZonedDateTime`. Otherwise, this reinterpretation changes the meaning of equality, etc, for the data within the column.~

I think we should pick option (3). It means zone matters for equality and comparison. It keeps `aggBy` and `sort` correct with current optimizations. We could eventually optimize `naturalJoin` and `join`, but they are correct as-is. We would have to fix a bug in `aj` that might result in error messages or incorrect equality/comparability.
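
The join rule in option (3) could be sketched roughly as follows. This is not the engine's API: `KeyColumn` is a hypothetical stand-in for a key `ColumnSource`, where a present `fixedZone` means "reinterpretable, with one zone for every row" (the role `ConvertibleTimeSource.Zoned.getZone()` plays in the discussion above), and `shouldConvertBoth` is an invented name.

```java
import java.time.ZoneId;
import java.util.Optional;

public class JoinReinterpretSketch {
    /**
     * Hypothetical stand-in for a join key column. A present zone means the
     * column is reinterpretable and has one fixed zone for all of its rows.
     */
    public static final class KeyColumn {
        final String name;
        final Optional<ZoneId> fixedZone;

        public KeyColumn(String name, Optional<ZoneId> fixedZone) {
            this.name = name;
            this.fixedZone = fixedZone;
        }
    }

    /** Convert to long only if both sides are reinterpretable and share the same fixed zone. */
    public static boolean shouldConvertBoth(KeyColumn left, KeyColumn right) {
        // Optional.equals is true only when both are present with equal values
        // (or both empty), so the isPresent() check rules out the empty case.
        return left.fixedZone.isPresent() && left.fixedZone.equals(right.fixedZone);
    }

    public static void main(String[] args) {
        KeyColumn ny = new KeyColumn("LeftTs", Optional.of(ZoneId.of("America/New_York")));
        KeyColumn ny2 = new KeyColumn("RightTs", Optional.of(ZoneId.of("America/New_York")));
        KeyColumn tokyo = new KeyColumn("RightTs", Optional.of(ZoneId.of("Asia/Tokyo")));
        KeyColumn mixed = new KeyColumn("RightTs", Optional.empty());

        System.out.println(shouldConvertBoth(ny, ny2));   // true: same fixed zone on both sides
        System.out.println(shouldConvertBoth(ny, tokyo)); // false: zones differ, keep ZonedDateTime semantics
        System.out.println(shouldConvertBoth(ny, mixed)); // false: one side cannot be reinterpreted
    }
}
```

When the check fails, both sides would stay as `ZonedDateTime`, so neither side is reinterpreted on its own and Java's equality still applies.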