AbsaOSS / enceladus

Dynamic Conformance Engine
Apache License 2.0

Spark date handling settings can mess up dates in Standardization-Conformance combined job #2189

Open benedeki opened 1 year ago

benedeki commented 1 year ago

Describe the bug

While #2175/#2184 makes it possible to process dates prior to 1900 in Enceladus 3 (without a code change), the way it is done introduced a possible bug that corrupts dates. If different settings are used for read and write (a LEGACY-CORRECTED pair), a combined Standardization & Conformance job will corrupt the dates.

To Reproduce

Steps to reproduce the behavior OR commands run:

  1. Have a dataset with pre-1900 timestamps from the ambiguous interval
  2. Run the combined Standardization & Conformance job with the read setting LEGACY and the write setting CORRECTED
  3. The dates in the output are corrupted
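
The general mechanism behind such a mismatch can be sketched without Spark. The Java snippet below is only an illustration under stated assumptions, not the Enceladus or Spark code path: the class `RebaseDemo` and its helper methods are hypothetical names, and it uses a date before 1582-10-15 (where the gap between the hybrid Julian+Gregorian calendar and the proleptic Gregorian calendar is easy to show with the JDK alone) rather than a pre-1900 timestamp. It stores a date as a day count under one calendar and reads the same count back under the other.

```java
import java.time.LocalDate;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Hypothetical demo class; not part of Enceladus or Spark.
class RebaseDemo {

    // Days since 1970-01-01 for a date label interpreted in the hybrid
    // Julian+Gregorian calendar (java.util.GregorianCalendar, default cutover 1582-10-15).
    public static long hybridEpochDay(int year, int month, int day) {
        GregorianCalendar cal = new GregorianCalendar(TimeZone.getTimeZone("UTC"));
        cal.clear();                     // reset all fields, so the time of day is 00:00 UTC
        cal.set(year, month - 1, day);   // Calendar months are 0-based
        return Math.floorDiv(cal.getTimeInMillis(), 86_400_000L);
    }

    // Days since 1970-01-01 for the same label interpreted in the
    // proleptic Gregorian calendar (java.time).
    public static long prolepticEpochDay(int year, int month, int day) {
        return LocalDate.of(year, month, day).toEpochDay();
    }

    public static void main(String[] args) {
        // "Write" the label 1582-10-04 using the hybrid calendar ...
        long stored = hybridEpochDay(1582, 10, 4);
        // ... and "read" the stored day count back using the proleptic calendar.
        LocalDate readBack = LocalDate.ofEpochDay(stored);
        System.out.println("written as 1582-10-04, read back as " + readBack);
        // prints: written as 1582-10-04, read back as 1582-10-14
    }
}
```

The label shifts by ten days because the two calendars disagree on how many days lie between that date and the epoch. Dates after the cutover encode identically in both calendars in this sketch, which is why the problem only shows up for old values.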

Expected behavior

The data should remain correct.

Additional context

Consider recording the date-reading standard that was used in the _INFO file.