trinodb / trino

Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Apache License 2.0

Improve speed when listing transaction logs during time travel in Delta Lake #21366

Open · ebyhr opened 2 months ago

ebyhr commented 2 months ago

Follow-up of https://github.com/trinodb/trino/pull/21052

Also, we should verify the number of file-listing calls when addressing this issue.

findinpath commented 2 months ago

Potential references:

https://github.com/delta-io/delta/commit/5ae57cc1ea58efd8b1e6cbbe13fbaeb51a231c4f

https://docs.google.com/document/d/13Nock1I8-143Dwidj8rMpgt3wAicrOI2OvDJt3OufOQ/edit#heading=h.3rtb99srgnl1