projectnessie / nessie

Nessie: Transactional Catalog for Data Lakes with Git-like semantics
https://projectnessie.org
Apache License 2.0
989 stars 130 forks source link

[Bug]: Not able to query old snapshots or time travel on Iceberg snapshots with nessie #8991

Closed schobe closed 3 months ago

schobe commented 3 months ago

What happened

Hi,

I do not see a way to query the data from old snapshot of iceberg tables. If there is a change on Iceberg table, the snapshot id is changed and pointed to new snapshot id in the history. db.table.snapshots only shows the current snapshot id and parent id is null. When I tried to query from the older snapshot using select * from db.table version as of oldsnapshot_id, it says that could not find snapshot id.

Does Nessie provide a way to query from older snapshots if the snapshots versioning is maintained via RPS? Is there any documentation around this if it is not a bug?

Thanks & Regards

How to reproduce it

  1. Create tables using icebrg rest with Nessie
  2. spark.sql("select * from db.table.snapshots").show() -- get snapshot id as oldsnapshot_id
  3. spark.sql("insert into db.table values..").show() -- insert new records in the table
  4. spark.sql("select * from db.table.snapshots").show() -- get current snapshot_id
  5. spark.sql("select * from db.table.snapshots version as of oldsnapshot_id").show() -- check if we can time travel with old snapshot

Nessie server type (docker/uber-jar/built from source) and version

0.90.4

Client type (Ex: UI/Spark/pynessie ...) and version

No response

Additional information

No response

snazy commented 3 months ago

This is not a bug but intentional behavior. The versioning in Iceberg and Nessie are different. While Iceberg has only single table are consistency, it can by its design not guarantee consistency across multiple tables and views and namespaces. Nessie can guarantee consistency across multiple entities - that's actually the central design principle. Nessie uses the "Git model" and allows "branch and merge".

The disconnect is that Iceberg's history is not necessarily the history of the table in Nessie's commit log. So with Iceberg REST we intentionally opted to expose only the snapshot that belongs to the particular Nessie commit.

You can however do time-travel with Nessie - either by specifying the commit ID or a timestamp. An example can be found here.

ajantha-bhat commented 3 months ago

@snazy: What if the user wants to move the table from Nessie catalog to another catalog? In that case when they register the latest table metadata, they can't do time travel because snapshot history is not maintained?

schobe commented 3 months ago

@snazy One more question, does this mean we can not do other data operations like expiring snapshot and orphan file clean up since the only current snapshot is maintained? What is another way for DataOps on Iceberg with Nessie approach?

dimas-b commented 3 months ago

What if the user wants to move the table from Nessie catalog to another catalog? In that case when they register the latest table metadata, they can't do time travel because snapshot history is not maintained?

I suppose Nessie could provide a tool to write the Iceberg snapshot lineage to metadata json starting with a particular commit for the purpose of migrating the table to another catalog.

This lineage will have to be limited to direct parents because the Iceberg format allows only a simple list of snapshots.

dimas-b commented 3 months ago

@schobe : Re: snapshot expiry and orphan files cleanup, please considering using the Nessie GC tool for that: https://projectnessie.org/nessie-latest/gc/

Nessie internally determines a particular snapshot's "liveness" bases on the full commit graph (which can be more complex than what the Iceberg format currently supports).