delta-io / delta

An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
https://delta.io
Apache License 2.0
7.62k stars 1.71k forks

Remove snapshotAnalysis from TahoeLogFileIndex #3722

Closed harperjiang closed 1 month ago

harperjiang commented 1 month ago

Which Delta project/connector is this regarding?

Description

This PR fixes the OOM caused by SparkSession.cloneSession and TemporaryView. It replaces the Snapshot reference in TahoeLogFileIndex with a SnapshotDescriptor, thereby removing the reference to SparkSession from TahoeLogFileIndex.
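For reviewers, here is a minimal sketch of the general pattern described above, not the actual patch. Every name in it (FakeSession, Descriptor, FullSnapshot, PlainDescriptor, IndexHoldingSnapshot, IndexHoldingDescriptor) is a hypothetical stand-in, not a Delta or Spark class. It assumes that the full snapshot object transitively retains the session, and that the descriptor handed to the index is a session-free value. How the real SnapshotDescriptor is produced may differ.

```scala
// Hypothetical stand-ins only; none of these are Delta's or Spark's real classes.
object DescriptorSketch {
  // Stands in for a heavyweight session object (for example, a cloned session).
  final class FakeSession(val name: String) {
    val state: Array[Byte] = new Array[Byte](1024) // placeholder for session state
  }

  // Narrow, session-free view of one table version.
  trait Descriptor {
    def version: Long
    def schemaString: String
  }

  // Full snapshot: satisfies Descriptor but also pins the session in memory.
  final class FullSnapshot(
      val session: FakeSession,
      val version: Long,
      val schemaString: String) extends Descriptor

  // Plain value that carries only the descriptor fields.
  final case class PlainDescriptor(version: Long, schemaString: String) extends Descriptor

  object PlainDescriptor {
    // Copy the fields so that no reference to the source object is kept.
    def from(d: Descriptor): PlainDescriptor = PlainDescriptor(d.version, d.schemaString)
  }

  // Before: the index keeps the snapshot, and transitively the session, reachable.
  final class IndexHoldingSnapshot(val snapshot: FullSnapshot)

  // After: the index keeps only the session-free descriptor.
  final class IndexHoldingDescriptor(val descriptor: Descriptor)

  // Builds one index of each shape from the same snapshot.
  def build(): (IndexHoldingSnapshot, IndexHoldingDescriptor) = {
    val session = new FakeSession("cloned")
    val snap = new FullSnapshot(session, 7L, "id INT")
    (new IndexHoldingSnapshot(snap), new IndexHoldingDescriptor(PlainDescriptor.from(snap)))
  }
}
```

In this sketch, anything that caches an IndexHoldingDescriptor (such as a plan behind a temporary view) no longer keeps the FakeSession reachable, which is the effect the description attributes to the change.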

How was this patch tested?

Unit tests (UT).

Does this PR introduce any user-facing changes?

No