delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0

[Spark] Add time tracking for file changes and getting dataframe for Delta source with/without CDC #3090

Closed anishshri-db closed 2 weeks ago

anishshri-db commented 2 weeks ago

Which Delta project/connector is this regarding?

Spark

Description

Adds time tracking for file changes and for getting the DataFrame for the Delta source, both with and without CDC.
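To illustrate the general idea of timing these two steps, here is a minimal Python sketch. It is not the code in this patch: `record_elapsed_ms`, `get_file_changes`, `create_dataframe`, `run_batch`, and the metric names are all hypothetical stand-ins, and the real Delta source is implemented in Scala on Spark.

```python
import time
from contextlib import contextmanager


@contextmanager
def record_elapsed_ms(metrics, name):
    """Add the wall-clock time of the wrapped block to metrics[name], in ms."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        metrics[name] = metrics.get(name, 0.0) + elapsed_ms


def get_file_changes():
    # Hypothetical stand-in for listing the files changed since the last batch.
    return ["part-0000.parquet", "part-0001.parquet"]


def create_dataframe(files, use_cdc):
    # Hypothetical stand-in for building the batch from the changed files,
    # with or without change data capture (CDC).
    mode = "cdc" if use_cdc else "plain"
    return [(path, mode) for path in files]


def run_batch(use_cdc):
    """Run both steps and return the rows plus per-step timings."""
    metrics = {}
    with record_elapsed_ms(metrics, "get_file_changes_ms"):
        files = get_file_changes()
    with record_elapsed_ms(metrics, "create_dataframe_ms"):
        rows = create_dataframe(files, use_cdc)
    return rows, metrics


if __name__ == "__main__":
    for use_cdc in (False, True):
        rows, metrics = run_batch(use_cdc)
        print(use_cdc, len(rows), sorted(metrics))
```

Recording the two steps under separate names keeps the cost of finding changed files apart from the cost of building the batch, so a slowdown can be attributed to one or the other.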

How was this patch tested?

Existing unit tests

Does this PR introduce any user-facing changes?

No

anishshri-db commented 2 weeks ago

The test failures are unrelated to this PR:

[info] *** 36 TESTS FAILED ***
[error] Failed: Total 11802, Failed 36, Errors 0, Passed 11766, Ignored 3317, Canceled 5
[error] Failed tests:
[error]     org.apache.spark.sql.delta.DeltaColumnRenameSuite
[error]     org.apache.spark.sql.delta.GeneratedColumnSuite
[error]     org.apache.spark.sql.delta.perf.OptimizeGeneratedColumnSuite

I see the same failures on other PR runs too: https://github.com/delta-io/delta/actions/runs/9085533374/job/24969089176?pr=3091

tdas commented 2 weeks ago

I think the Spark master test failures are independent of this PR. Merging this.