## Which issue does this PR close?
Closes https://github.com/apache/datafusion-comet/issues/914
## Rationale for this change
The total scan time reported by Comet is often less than the reported time to decode Parquet data, which does not make sense.

The cause is that we convert nanoseconds to milliseconds for each batch, truncating up to a millisecond of scan time per batch; accumulated over many batches, this loses a significant amount of time. In one example, the actual total scan time was 41 seconds but it was reported as 23 seconds, which is very misleading. Spark also suffers from this problem.
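The effect of per-batch truncation can be illustrated with a minimal sketch (the batch count and per-batch timing below are made-up numbers, not measurements from Comet):

```rust
fn main() {
    // Simulate 10,000 batches, each taking 1.9 ms (1,900,000 ns) to scan.
    let batches = 10_000u64;
    let per_batch_ns = 1_900_000u64;

    // Converting to milliseconds per batch truncates 0.9 ms every time.
    let truncated_ms: u64 = (0..batches).map(|_| per_batch_ns / 1_000_000).sum();

    // Accumulating in nanoseconds and converting once keeps the precision.
    let total_ns: u64 = (0..batches).map(|_| per_batch_ns).sum();

    println!("converted per batch:  {} ms", truncated_ms);      // 10000 ms
    println!("converted once:       {} ms", total_ns / 1_000_000); // 19000 ms
}
```

Here the per-batch conversion under-reports the scan time by almost half, which is the same order of error as the 41 s vs 23 s example above.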
### Before

### After
## What changes are included in this PR?
Change scan time to be recorded in nanos.
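A minimal sketch of the approach, assuming a hypothetical `ScanTimeMetric` type (not the project's actual metric code, which lives in Comet's Scala/native layers): accumulate elapsed time in nanoseconds and convert to milliseconds only once, at report time.

```rust
use std::time::Instant;

// Hypothetical metric: accumulates scan time at nanosecond precision.
struct ScanTimeMetric {
    total_nanos: u64,
}

impl ScanTimeMetric {
    fn new() -> Self {
        Self { total_nanos: 0 }
    }

    // Record one batch's elapsed time without losing precision.
    fn record(&mut self, start: Instant) {
        self.total_nanos += start.elapsed().as_nanos() as u64;
    }

    // Convert once when reporting, so at most one truncation occurs
    // instead of one per batch.
    fn report_millis(&self) -> u64 {
        self.total_nanos / 1_000_000
    }
}

fn main() {
    let mut metric = ScanTimeMetric::new();
    let start = Instant::now();
    // ... decode a batch of Parquet data here ...
    metric.record(start);
    println!("scan time: {} ms", metric.report_millis());
}
```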
## How are these changes tested?
Manual testing; see the before/after screenshots above.