delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0

[Spark] InCommitTimestamp: Use clock.currentTimeMillis() instead of nanoTime() in commitLarge #3111

Closed dhruvarya-db closed 2 weeks ago

dhruvarya-db commented 2 weeks ago

Which Delta project/connector is this regarding?

Spark

Description

We currently use NANOSECONDS.toMillis(System.nanoTime()) to generate the in-commit timestamp (ICT) when commitLarge is called. This is incorrect: System.nanoTime() is only meant for measuring elapsed time, not for approximating wall-clock time, because its origin is arbitrary. On some systems System.nanoTime() returns a very small number, so the ICT is sometimes very small (e.g. 1 Jan 1970). This PR changes commitLarge to use clock.getCurrentTimeMillis() instead.
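The difference can be illustrated with a minimal, standalone Java sketch. It is not code from this PR: the class and method names are made up for illustration, and only standard JDK calls are used. It contrasts the problematic conversion with a wall-clock read and shows the use case nanoTime() is actually meant for.

```java
import java.time.Instant;
import java.util.concurrent.TimeUnit;

// Illustrative only: class and method names are hypothetical, not part of Delta.
public class NanoTimeVsWallClock {

    // The problematic pattern from the description. nanoTime() counts from an
    // arbitrary origin, so the converted value is not an epoch timestamp.
    public static long millisFromNanoTime() {
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime());
    }

    // Wall-clock time: milliseconds since the Unix epoch.
    public static long millisFromWallClock() {
        return System.currentTimeMillis();
    }

    // What nanoTime() is intended for: measuring the elapsed time of some work.
    public static long elapsedMillis(Runnable work) {
        long start = System.nanoTime();
        work.run();
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) {
        // The first line is platform dependent and may print a date near 1970.
        System.out.println("from nanoTime:   " + Instant.ofEpochMilli(millisFromNanoTime()));
        System.out.println("from wall clock: " + Instant.ofEpochMilli(millisFromWallClock()));
        System.out.println("elapsed ms:      " + elapsedMillis(() -> { }));
    }
}
```

The value derived from nanoTime() depends on the JVM and operating system, which is why the bad timestamps only showed up on some systems.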

How was this patch tested?

Added a test case to ensure that clock.getCurrentTimeMillis() is being used.
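A test of this kind typically injects a clock that reports a fixed time and checks that the generated timestamp matches it. The Java sketch below shows the idea under stated assumptions: the Clock interface, FixedClock, and generateTimestamp are hypothetical stand-ins (the method name only mirrors the one mentioned in this description) and are not Delta's actual classes or the test added in this PR.

```java
// Illustrative only: every type and method here is a hypothetical stand-in.
public class InjectableClockSketch {

    /** Hypothetical clock abstraction; the method name mirrors the description. */
    public interface Clock {
        long getCurrentTimeMillis();
    }

    /** Test double that always reports the same fixed time. */
    public static final class FixedClock implements Clock {
        private final long millis;

        public FixedClock(long millis) {
            this.millis = millis;
        }

        @Override
        public long getCurrentTimeMillis() {
            return millis;
        }
    }

    /** Hypothetical stand-in for the code path that assigns the timestamp. */
    public static long generateTimestamp(Clock clock) {
        return clock.getCurrentTimeMillis();
    }

    public static void main(String[] args) {
        long expected = 1_700_000_000_000L;
        long actual = generateTimestamp(new FixedClock(expected));
        // If the code under test read System.nanoTime() instead, this would fail.
        if (actual != expected) {
            throw new AssertionError("timestamp did not come from the injected clock");
        }
        System.out.println("timestamp taken from injected clock: " + actual);
    }
}
```

Because the fixed value is known in advance, the check is deterministic and does not depend on the machine's real clock.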

Does this PR introduce any user-facing changes?

No