apache / iceberg

Apache Iceberg
https://iceberg.apache.org/
Apache License 2.0

EMR 6.10.0 Cannot migrate a table from a non-Iceberg Spark Session Catalog. Found spark_catalog #7317

Closed dthauvin closed 2 months ago

dthauvin commented 1 year ago

Apache Iceberg version

1.1.0

Query engine

EMR

Please describe the bug 🐞

Hello

I'm currently using Spark on EMR 6.10.0, which embeds Iceberg 1.1.0-amzn-0.

I tried running the migrate procedure to replace an existing table with an Iceberg table, but I get the following error:

code :

spark.sql("""
CALL glue.system.migrate('spark_catalog.nyc.taxis_origin')
""")

stack trace:

An error was encountered:
Cannot migrate a table from a non-Iceberg Spark Session Catalog. Found spark_catalog of class org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog as the source catalog.
Traceback (most recent call last):
  File "/mnt/yarn/usercache/livy/appcache/application_1681132811497_0003/container_1681132811497_0003_01_000001/pyspark.zip/pyspark/sql/session.py", line 1034, in sql
    return DataFrame(self._jsparkSession.sql(sqlQuery), self)
  File "/mnt/yarn/usercache/livy/appcache/application_1681132811497_0003/container_1681132811497_0003_01_000001/py4j-0.10.9.5-src.zip/py4j/java_gateway.py", line 1322, in __call__
    answer, self.gateway_client, self.target_id, self.name)
  File "/mnt/yarn/usercache/livy/appcache/application_1681132811497_0003/container_1681132811497_0003_01_000001/pyspark.zip/pyspark/sql/utils.py", line 196, in deco
    raise converted from None
pyspark.sql.utils.IllegalArgumentException: Cannot migrate a table from a non-Iceberg Spark Session Catalog. Found spark_catalog of class org.apache.spark.sql.execution.datasources.v2.V2SessionCatalog as the source catalog.

My EMR cluster has the following spark-defaults configuration:

{
    "spark.sql.catalog.glue": "org.apache.iceberg.spark.SparkCatalog",
    "spark.sql.catalog.glue.warehouse": "s3://mybucket/warehouseiceberg/",
    "spark.sql.catalog.glue.catalog-impl": "org.apache.iceberg.aws.glue.GlueCatalog",
    "spark.sql.catalog.glue.io-impl": "org.apache.iceberg.aws.s3.S3FileIO",
    "spark.sql.catalog.glue.lock.table": "mytable-iceberg-lock",
    "spark.sql.catalog.glue.lakeformation-enabled": "false",
    "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions"
}

and the following iceberg-defaults EMR classification:

{
    "Classification":"iceberg-defaults",
    "Properties":{"iceberg.enabled":"true"}
}

By the way, the add_files procedure works well:

spark.sql("""
CREATE TABLE IF NOT EXISTS glue.nyc.taxis_migrated(
  `VendorID` BIGINT,
  `tpep_pickup_datetime` TIMESTAMP,
  `tpep_dropoff_datetime` TIMESTAMP,
  `passenger_count` DOUBLE,
  `trip_distance` DOUBLE,
  `RatecodeID` DOUBLE,
  `store_and_fwd_flag` STRING,
  `PULocationID` BIGINT,
  `DOLocationID` BIGINT,
  `payment_type` BIGINT,
  `fare_amount` DOUBLE,
  `extra` DOUBLE,
  `mta_tax` DOUBLE,
  `tip_amount` DOUBLE,
  `tolls_amount` DOUBLE,
  `improvement_surcharge` DOUBLE,
  `total_amount` DOUBLE,
  `congestion_surcharge` DOUBLE,
  `airport_fee` DOUBLE)
USING iceberg
LOCATION 's3://mybucket/tripdata_migrated/'
TBLPROPERTIES(
  'write.parquet.compression-codec'='snappy'
)
""")

spark.sql("""CALL glue.system.add_files(
  table => 'nyc.taxis_migrated',
  source_table => 'spark_catalog.nyc.taxis_origin'
)""")

DataFrame[added_files_count: bigint]
singhpk234 commented 1 year ago

I think you may want to configure the Spark session catalog as SparkSessionCatalog as well, since it is required to load a non-Iceberg source table.

Can you please try migrate with the configuration below?

spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog
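
For example, in PySpark this could be set when building the session (a sketch only; the spark_catalog entry is the suggested change, and the extensions line just mirrors the spark-defaults already shown above):

from pyspark.sql import SparkSession

# Sketch: spark_catalog delegates to Iceberg's SparkSessionCatalog so that
# procedures such as migrate can load the non-Iceberg source table.
spark = (
    SparkSession.builder
    .config("spark.sql.extensions",
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.iceberg.spark.SparkSessionCatalog")
    .getOrCreate()
)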
dthauvin commented 1 year ago

Hello, after configuring

spark.sql.catalog.spark_catalog=org.apache.iceberg.spark.SparkSessionCatalog

I get the following error:


An error was encountered:
An error occurred while calling o119.sql.
: org.apache.iceberg.hive.RuntimeMetaException: Failed to connect to Hive Metastore
    at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:84)
    at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:34)
    at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:125)
    at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:56)
    at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
    at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:82)
    at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:223)
    at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:97)
    at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:80)
    at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:44)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2406)
    at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2404)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2387)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
    at org.apache.iceberg.CachingCatalog.loadTable(CachingCatalog.java:166)
    at org.apache.iceberg.spark.SparkCatalog.load(SparkCatalog.java:608)
    at org.apache.iceberg.spark.SparkCatalog.loadTable(SparkCatalog.java:145)
    at org.apache.spark.sql.connector.catalog.CatalogV2Util$.loadTable(CatalogV2Util.scala:311)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.$anonfun$lookupRelation$3(Analyzer.scala:1198)
    at scala.Option.orElse(Option.scala:447)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.$anonfun$lookupRelation$1(Analyzer.scala:1197)
    at scala.Option.orElse(Option.scala:447)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupRelation(Analyzer.scala:1189)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$13.applyOrElse(Analyzer.scala:1060)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$13.applyOrElse(Analyzer.scala:1024)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:179)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1282)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1281)
    at org.apache.spark.sql.catalyst.plans.logical.OrderPreservingUnaryNode.mapChildren(LogicalPlan.scala:227)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1282)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1281)
    at org.apache.spark.sql.catalyst.plans.logical.OrderPreservingUnaryNode.mapChildren(LogicalPlan.scala:227)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1282)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1281)
    at org.apache.spark.sql.catalyst.plans.logical.GlobalLimit.mapChildren(basicLogicalOperators.scala:1625)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:1024)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:983)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:215)
    at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
    at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
    at scala.collection.immutable.List.foldLeft(List.scala:91)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeBatch$1(RuleExecutor.scala:212)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$6(RuleExecutor.scala:284)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$RuleExecutionContext$.withContext(RuleExecutor.scala:327)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$5(RuleExecutor.scala:284)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$5$adapted(RuleExecutor.scala:274)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:274)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:188)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:227)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:223)
    at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:172)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:223)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:187)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:208)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:207)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:79)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:192)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:214)
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:554)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:214)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:213)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:79)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:77)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:69)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:101)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:99)
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:750)
Caused by: java.lang.RuntimeException: Unable to instantiate org.apache.hadoop.hive.metastore.HiveMetaStoreClient
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1742)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:87)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:137)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:108)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:101)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:60)
    at org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:72)
    at org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:185)
    at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:63)
    ... 113 more
Caused by: java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1740)
    ... 125 more
Caused by: MetaException(message:Unable to open a test connection to the given database. JDBC url = jdbc:derby:;databaseName=/var/lib/hadoop/metastore/metastore_db;create=true, username = APP. Terminating connection pool (set lazyInit to true if you expect to start your database after your app). Original Exception: ------
java.sql.SQLException: Failed to create database '/var/lib/hadoop/metastore/metastore_db', see the next exception for details.
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.getSQLException(Unknown Source)
    at org.apache.derby.impl.jdbc.Util.seeNextException(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.createDatabase(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection.<init>(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver$1.run(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver$1.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.jdbc.InternalDriver.getNewEmbedConnection(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
    at org.apache.derby.jdbc.InternalDriver.connect(Unknown Source)
    at org.apache.derby.jdbc.AutoloadedDriver.connect(Unknown Source)
    at java.sql.DriverManager.getConnection(DriverManager.java:664)
    at java.sql.DriverManager.getConnection(DriverManager.java:208)
    at com.jolbox.bonecp.BoneCP.obtainRawInternalConnection(BoneCP.java:361)
    at com.jolbox.bonecp.BoneCP.<init>(BoneCP.java:416)
    at com.jolbox.bonecp.BoneCPDataSource.getConnection(BoneCPDataSource.java:120)
    at org.datanucleus.store.rdbms.ConnectionFactoryImpl$ManagedConnectionImpl.getConnection(ConnectionFactoryImpl.java:483)
    at org.datanucleus.store.rdbms.RDBMSStoreManager.<init>(RDBMSStoreManager.java:297)
    at sun.reflect.GeneratedConstructorAccessor88.newInstance(Unknown Source)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.datanucleus.plugin.NonManagedPluginRegistry.createExecutableExtension(NonManagedPluginRegistry.java:606)
    at org.datanucleus.plugin.PluginManager.createExecutableExtension(PluginManager.java:301)
    at org.datanucleus.NucleusContextHelper.createStoreManagerForProperties(NucleusContextHelper.java:133)
    at org.datanucleus.PersistenceNucleusContextImpl.initialise(PersistenceNucleusContextImpl.java:422)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.freezeConfiguration(JDOPersistenceManagerFactory.java:817)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.createPersistenceManagerFactory(JDOPersistenceManagerFactory.java:334)
    at org.datanucleus.api.jdo.JDOPersistenceManagerFactory.getPersistenceManagerFactory(JDOPersistenceManagerFactory.java:213)
    at sun.reflect.GeneratedMethodAccessor67.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at javax.jdo.JDOHelper$16.run(JDOHelper.java:1975)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.jdo.JDOHelper.invoke(JDOHelper.java:1970)
    at javax.jdo.JDOHelper.invokeGetPersistenceManagerFactoryOnImplementation(JDOHelper.java:1177)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:814)
    at javax.jdo.JDOHelper.getPersistenceManagerFactory(JDOHelper.java:702)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPMF(ObjectStore.java:526)
    at org.apache.hadoop.hive.metastore.ObjectStore.getPersistenceManager(ObjectStore.java:555)
    at org.apache.hadoop.hive.metastore.ObjectStore.initializeHelper(ObjectStore.java:410)
    at org.apache.hadoop.hive.metastore.ObjectStore.initialize(ObjectStore.java:347)
    at org.apache.hadoop.hive.metastore.ObjectStore.setConf(ObjectStore.java:308)
    at org.apache.hadoop.util.ReflectionUtils.setConf(ReflectionUtils.java:79)
    at org.apache.hadoop.util.ReflectionUtils.newInstance(ReflectionUtils.java:139)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.<init>(RawStoreProxy.java:58)
    at org.apache.hadoop.hive.metastore.RawStoreProxy.getProxy(RawStoreProxy.java:67)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.newRawStoreForConf(HiveMetaStore.java:627)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMSForConf(HiveMetaStore.java:593)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.getMS(HiveMetaStore.java:587)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.createDefaultDB(HiveMetaStore.java:658)
    at org.apache.hadoop.hive.metastore.HiveMetaStore$HMSHandler.init(HiveMetaStore.java:430)
    at sun.reflect.GeneratedMethodAccessor89.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invokeInternal(RetryingHMSHandler.java:148)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.invoke(RetryingHMSHandler.java:107)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.<init>(RetryingHMSHandler.java:79)
    at org.apache.hadoop.hive.metastore.RetryingHMSHandler.getProxy(RetryingHMSHandler.java:92)
    at org.apache.hadoop.hive.metastore.HiveMetaStore.newRetryingHMSHandler(HiveMetaStore.java:6950)
    at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.<init>(HiveMetaStoreClient.java:162)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at org.apache.hadoop.hive.metastore.MetaStoreUtils.newInstance(MetaStoreUtils.java:1740)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.<init>(RetryingMetaStoreClient.java:87)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:137)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:108)
    at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.getProxy(RetryingMetaStoreClient.java:101)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.iceberg.common.DynMethods$UnboundMethod.invokeChecked(DynMethods.java:60)
    at org.apache.iceberg.common.DynMethods$UnboundMethod.invoke(DynMethods.java:72)
    at org.apache.iceberg.common.DynMethods$StaticMethod.invoke(DynMethods.java:185)
    at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:63)
    at org.apache.iceberg.hive.HiveClientPool.newClient(HiveClientPool.java:34)
    at org.apache.iceberg.ClientPoolImpl.get(ClientPoolImpl.java:125)
    at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:56)
    at org.apache.iceberg.ClientPoolImpl.run(ClientPoolImpl.java:51)
    at org.apache.iceberg.hive.CachedClientPool.run(CachedClientPool.java:82)
    at org.apache.iceberg.hive.HiveTableOperations.doRefresh(HiveTableOperations.java:223)
    at org.apache.iceberg.BaseMetastoreTableOperations.refresh(BaseMetastoreTableOperations.java:97)
    at org.apache.iceberg.BaseMetastoreTableOperations.current(BaseMetastoreTableOperations.java:80)
    at org.apache.iceberg.BaseMetastoreCatalog.loadTable(BaseMetastoreCatalog.java:44)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.lambda$doComputeIfAbsent$14(BoundedLocalCache.java:2406)
    at java.util.concurrent.ConcurrentHashMap.compute(ConcurrentHashMap.java:1853)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.doComputeIfAbsent(BoundedLocalCache.java:2404)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:2387)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:108)
    at org.apache.iceberg.shaded.com.github.benmanes.caffeine.cache.LocalManualCache.get(LocalManualCache.java:62)
    at org.apache.iceberg.CachingCatalog.loadTable(CachingCatalog.java:166)
    at org.apache.iceberg.spark.SparkCatalog.load(SparkCatalog.java:608)
    at org.apache.iceberg.spark.SparkCatalog.loadTable(SparkCatalog.java:145)
    at org.apache.spark.sql.connector.catalog.CatalogV2Util$.loadTable(CatalogV2Util.scala:311)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.$anonfun$lookupRelation$3(Analyzer.scala:1198)
    at scala.Option.orElse(Option.scala:447)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.$anonfun$lookupRelation$1(Analyzer.scala:1197)
    at scala.Option.orElse(Option.scala:447)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.org$apache$spark$sql$catalyst$analysis$Analyzer$ResolveRelations$$lookupRelation(Analyzer.scala:1189)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$13.applyOrElse(Analyzer.scala:1060)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$$anonfun$apply$13.applyOrElse(Analyzer.scala:1024)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$3(AnalysisHelper.scala:138)
    at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(TreeNode.scala:179)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:138)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1282)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1281)
    at org.apache.spark.sql.catalyst.plans.logical.OrderPreservingUnaryNode.mapChildren(LogicalPlan.scala:227)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1282)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1281)
    at org.apache.spark.sql.catalyst.plans.logical.OrderPreservingUnaryNode.mapChildren(LogicalPlan.scala:227)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$2(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren(TreeNode.scala:1282)
    at org.apache.spark.sql.catalyst.trees.UnaryLike.mapChildren$(TreeNode.scala:1281)
    at org.apache.spark.sql.catalyst.plans.logical.GlobalLimit.mapChildren(basicLogicalOperators.scala:1625)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.$anonfun$resolveOperatorsUpWithPruning$1(AnalysisHelper.scala:135)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:323)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning(AnalysisHelper.scala:134)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.resolveOperatorsUpWithPruning$(AnalysisHelper.scala:130)
    at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.resolveOperatorsUpWithPruning(LogicalPlan.scala:31)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:1024)
    at org.apache.spark.sql.catalyst.analysis.Analyzer$ResolveRelations$.apply(Analyzer.scala:983)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$1(RuleExecutor.scala:215)
    at scala.collection.LinearSeqOptimized.foldLeft(LinearSeqOptimized.scala:126)
    at scala.collection.LinearSeqOptimized.foldLeft$(LinearSeqOptimized.scala:122)
    at scala.collection.immutable.List.foldLeft(List.scala:91)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeBatch$1(RuleExecutor.scala:212)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$6(RuleExecutor.scala:284)
    at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor$RuleExecutionContext$.withContext(RuleExecutor.scala:327)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$5(RuleExecutor.scala:284)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$execute$5$adapted(RuleExecutor.scala:274)
    at scala.collection.immutable.List.foreach(List.scala:431)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:274)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.execute(RuleExecutor.scala:188)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.org$apache$spark$sql$catalyst$analysis$Analyzer$$executeSameContext(Analyzer.scala:227)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$execute$1(Analyzer.scala:223)
    at org.apache.spark.sql.catalyst.analysis.AnalysisContext$.withNewAnalysisContext(Analyzer.scala:172)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:223)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.execute(Analyzer.scala:187)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.$anonfun$executeAndTrack$1(RuleExecutor.scala:179)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:107)
    at org.apache.spark.sql.catalyst.rules.RuleExecutor.executeAndTrack(RuleExecutor.scala:179)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.$anonfun$executeAndCheck$1(Analyzer.scala:208)
    at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.markInAnalyzer(AnalysisHelper.scala:330)
    at org.apache.spark.sql.catalyst.analysis.Analyzer.executeAndCheck(Analyzer.scala:207)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$analyzed$1(QueryExecution.scala:79)
    at org.apache.spark.sql.catalyst.QueryPlanningTracker.measurePhase(QueryPlanningTracker.scala:192)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$2(QueryExecution.scala:214)
    at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:554)
    at org.apache.spark.sql.execution.QueryExecution.$anonfun$executePhase$1(QueryExecution.scala:214)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.execution.QueryExecution.executePhase(QueryExecution.scala:213)
    at org.apache.spark.sql.execution.QueryExecution.analyzed$lzycompute(QueryExecution.scala:79)
    at org.apache.spark.sql.execution.QueryExecution.analyzed(QueryExecution.scala:77)
    at org.apache.spark.sql.execution.QueryExecution.assertAnalyzed(QueryExecution.scala:69)
    at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:101)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:99)
    at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:622)
    at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:779)
    at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:617)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at py4j.reflection.MethodInvoker.invoke(MethodInvoker.java:244)
    at py4j.reflection.ReflectionEngine.invoke(ReflectionEngine.java:357)
    at py4j.Gateway.invoke(Gateway.java:282)
    at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:132)
    at py4j.commands.CallCommand.execute(CallCommand.java:79)
    at py4j.GatewayConnection.run(GatewayConnection.java:238)
    at java.lang.Thread.run(Thread.java:750)
Caused by: ERROR XJ041: Failed to create database '/var/lib/hadoop/metastore/metastore_db', see the next exception for details.
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.impl.jdbc.SQLExceptionFactory.wrapArgsForTransportAcrossDRDA(Unknown Source)
    ... 191 more
Caused by: ERROR XBM0H: Directory /mnt/var/lib/hadoop/metastore/metastore_db cannot be created.
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.iapi.error.StandardException.newException(Unknown Source)
    at org.apache.derby.impl.services.monitor.StorageFactoryService$10.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.services.monitor.StorageFactoryService.createServiceRoot(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.bootService(Unknown Source)
    at org.apache.derby.impl.services.monitor.BaseMonitor.createPersistentService(Unknown Source)
    at org.apache.derby.impl.services.monitor.FileMonitor.createPersistentService(Unknown Source)
    at org.apache.derby.iapi.services.monitor.Monitor.createPersistentService(Unknown Source)
    at org.apache.derby.impl.jdbc.EmbedConnection$5.run(Unknown Source)
    at java.security.AccessController.doPrivileged(Native Method)
    at org.apache.derby.impl.jdbc.EmbedConnection.createPersistentService(Unknown Source)
    ... 188 more
------
wood-dunamu commented 1 year ago

@dthauvin Did you solve that problem?

matthijseikelenboom commented 1 year ago

I also get this error. Our situation is as follows: we have a Spark data lake with existing data and now want to adopt Iceberg. When we run the migrate procedure to convert Spark Parquet tables to Iceberg tables, we get the same error.

@singhpk234 Are you saying I should add that line to the config, or should I change my current spark.sql.catalog.iceberghadoop to org.apache.iceberg.spark.SparkSessionCatalog?

tomtongue commented 1 year ago

Sorry for jumping in. I investigated the migrate issue for GlueCatalog myself, so let me share the result of my investigation.

Result

Currently, it's NOT possible to run the migrate procedure for Spark/Hive tables in the Glue Data Catalog. The reason is that the GlueCatalog client doesn't currently support renaming these tables on the path migrate takes. Let me elaborate below; if I'm wrong, please correct me.

Details

When running the migrate procedure for a Spark/Hive table in the Glue Data Catalog, the SparkSessionCatalog configuration should be specified as described above, e.g. .config("spark.sql.catalog.spark_catalog", "org.apache.iceberg.spark.SparkSessionCatalog").

In this case, the source table in the migrate call, such as table => '$db.$table', always resolves to spark_catalog (if another catalog is specified, the Spark application fails).

Given this, in the current design of migrate, the code path always goes through SparkSessionCatalog.renameTable. By specification, migrate creates a staging table, renames the source table to keep it as a backup, and then migrates the source table to Iceberg; after the migration, the backup table is dropped or kept depending on the drop_backup parameter. SparkSessionCatalog.renameTable is called in the phase that renames the source table to the backup table.
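
For reference, the call shape described above is roughly the following (a sketch only, reusing the table name from the report above; drop_backup defaults to false, and this is the call that fails for a Glue-backed source table):

# Sketch of the migrate call shape discussed in this comment.
spark.sql("""
CALL glue.system.migrate(
  table => 'spark_catalog.nyc.taxis_origin',
  drop_backup => false
)
""")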

SparkSessionCatalog.renameTable can delegate to the Iceberg catalog to rename a table in GlueCatalog: the method checks the source table, and if it is an Iceberg table it calls IcebergCatalog.renameTable (GlueCatalogImpl is configured here, so GlueCatalogImpl.renameTable would be used). However, in this case the source table always belongs to spark_catalog, so the code path always goes to getSessionCatalog().renameTable, as follows:

https://github.com/apache/iceberg/blob/apache-iceberg-1.4.1/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java#L293

  @Override
  public void renameTable(Identifier from, Identifier to)
      throws NoSuchTableException, TableAlreadyExistsException {
    // rename is not supported by HadoopCatalog. to avoid UnsupportedOperationException for session
    // catalog tables,
    // check table existence first to ensure that the table belongs to the Iceberg catalog.
    if (icebergCatalog.tableExists(from)) {
      icebergCatalog.renameTable(from, to);
    } else {
      getSessionCatalog().renameTable(from, to);  // <= THIS PATH
    }
  }

getSessionCatalog().renameTable calls Hive APIs for the table in the Glue Data Catalog, so it fails on the rename.

Here is the detailed call flow (in Iceberg 1.4.1 with Spark 3.5):

  1. https://github.com/apache/iceberg/blob/apache-iceberg-1.4.1/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/procedures/MigrateTableProcedure.java#L76 -> calls MigrateTableSparkAction
  2. https://github.com/apache/iceberg/blob/apache-iceberg-1.4.1/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/MigrateTableSparkAction.java#L118 -> the actual migration implementation
    1. renameAndBackupSourceTable() is called to keep the backup
    2. https://github.com/apache/iceberg/blob/apache-iceberg-1.4.1/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/actions/MigrateTableSparkAction.java#L209 -> renameAndBackupSourceTable, where destCatalog().renameTable(...) is called; destCatalog() is set by this.destCatalog = checkDestinationCatalog(sourceCatalog); at L66 in the same class
  3. https://github.com/apache/iceberg/blob/apache-iceberg-1.4.1/spark/v3.5/spark/src/main/java/org/apache/iceberg/spark/SparkSessionCatalog.java#L293 -> since destCatalog is SparkSessionCatalog (step 2), getSessionCatalog().renameTable is called

Resolution

If the GlueCatalog renameTable could be used to keep the backup table, it would be possible to run migrate. To resolve this, one option would be to add a new parameter that specifies the destination catalog.

wstevenson-r7 commented 9 months ago

For what it's worth, I see the same issue with snapshot. Any progress on this?

tomtongue commented 9 months ago

@wstevenson-r7 You can use snapshot with the Glue Data Catalog. Can you try setting catalog.db.tbl for your target table?
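
For example, something like the following might work (a sketch only; the target table name glue.nyc.taxis_snapshot is illustrative and just reuses the catalog and database names from earlier in this thread):

# Sketch: snapshot creates a new Iceberg table in the "glue" catalog that
# references the source table's data files, without renaming the source table.
spark.sql("""
CALL glue.system.snapshot(
  source_table => 'spark_catalog.nyc.taxis_origin',
  table => 'glue.nyc.taxis_snapshot'
)
""")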

github-actions[bot] commented 3 months ago

This issue has been automatically marked as stale because it has been open for 180 days with no activity. It will be closed in next 14 days if no further activity occurs. To permanently prevent this issue from being considered stale, add the label 'not-stale', but commenting on the issue is preferred when possible.

github-actions[bot] commented 2 months ago

This issue has been closed because it has not received any activity in the last 14 days since being marked as 'stale'.