apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

Is there any way to rewrite onto a specific branch? #8762

Open zinking opened 9 months ago

zinking commented 9 months ago

Query engine

Spark

Question

I thought this might work:

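      import org.apache.iceberg.expressions.Expressions
      import org.apache.iceberg.spark.Spark3Util
      import org.apache.iceberg.spark.actions.SparkActions

      // Load the table through a branch-qualified identifier
      val table = s"iceberg_catalog.${tableIdentifier}.branch_${branch}"
      val t = Spark3Util.loadIcebergTable(spark, table)
      val start = System.currentTimeMillis()
      try {
        // Rewrite only the data files in the ds = 20230923 partition
        SparkActions.get()
          .rewriteDataFiles(t)
          .skipPlanDeletes(skipPlanDeletes)
          .filter(Expressions.equal("ds", 20230923))
          .execute()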

I assumed the data would be read from the branch and the rewrite result would be written back onto the branch.

But it is not: the change still seems to be visible on main.

rakesh-das08 commented 9 months ago

Can you try using the rewrite_data_files Spark procedure instead?
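
For reference, a minimal sketch of what calling that procedure from Scala could look like. It only builds the CALL statement; the catalog name iceberg_catalog is taken from the question, while the table name db.sample, the object name, and the helper rewriteCall are hypothetical. Whether the procedure reads from or commits to a branch is not established in this thread, so this shows the call shape only.

```scala
object RewriteProcedureSketch {
  // Builds a CALL statement for Iceberg's rewrite_data_files Spark procedure.
  // `table` and `where` are named arguments of that procedure; this helper
  // itself is hypothetical. Values are inlined without escaping, so inputs
  // that contain single quotes are not handled here.
  def rewriteCall(catalog: String, table: String, where: String): String =
    s"CALL $catalog.system.rewrite_data_files(table => '$table', where => '$where')"

  def main(args: Array[String]): Unit = {
    // Same partition filter as in the question (ds = 20230923)
    val stmt = rewriteCall("iceberg_catalog", "db.sample", "ds = 20230923")
    println(stmt)
    // With a live SparkSession named `spark`, the statement would be run as:
    //   spark.sql(stmt).show()
  }
}
```

The procedure takes its filter as a SQL predicate string rather than an Expressions object, which is why the ds filter is passed as text here.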