Closed: maruppel closed this issue 1 month ago
@maruppel: Thank you for opening the issue. To make sure that we cover your issue, could you specify what kind of table you are trying to migrate?
I expect it to be a "table in mount", as that is the only place where we specify the schema explicitly; the others use a "SELECT * FROM ...".
All the tables with the error are DBFS root tables.
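For context, the two migration shapes referred to above look roughly as follows. This is only an illustrative sketch, not UCX's actual code; the table names, storage location, and column names are invented.

```python
# Illustrative sketch only -- not UCX's actual implementation. Table names,
# the storage location, and the column names below are invented.

# CTAS-style migration ("SELECT * FROM ..."): column names come straight from
# the source table, so nothing has to be re-serialized into a schema string.
ctas_ddl = (
    "CREATE TABLE my_catalog.my_schema.my_table "
    "AS SELECT * FROM hive_metastore.my_schema.my_table"
)

# "Table in mount"-style migration: the schema is written out explicitly, so a
# column name that is not serialized or escaped correctly (for example a
# decimal-looking name such as 10.0, or an empty string) goes verbatim into
# the CreateTable RPC and can be rejected there.
explicit_ddl = (
    "CREATE TABLE my_catalog.my_schema.my_table (col1 INT, col2 STRING) "
    "LOCATION 'abfss://container@storageaccount.dfs.core.windows.net/path/to/table'"
)
```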
Seems like the issue persists; see the log below, running version 0.37.0:

details = "INVALID_PARAMETER_VALUE: Invalid input: RPC CreateTable Field managedcatalog.ColumnInfo.name: At columns.0: name "" is not a valid name"
debug_error_string = "UNKNOWN:Error received from peer unix:/databricks/sparkconnect/grpc.sock {grpc_message:"INVALID_PARAMETER_VALUE: Invalid input: RPC CreateTable Field managedcatalog.ColumnInfo.name: At columns.0: name \"\" is not a valid name", grpc_status:13, created_time:"2024-09-24T15:12:37.144341818+00:00"}"
15:12:37 WARN [d.l.u.hive_metastore.table_migrate][migrate_tables_1] failed-to-migrate: Failed to migrate table hive_metastore.[schema].[table] to [catalog].[schema].[table]: INVALID_PARAMETER_VALUE: Invalid input: RPC CreateTable Field managedcatalog.ColumnInfo.name: At columns.0: name "" is not a valid name
Hmm, do you get more information in the debug logs? See the logs folder in the UCX installation folder in your workspace.
No details other than the stack trace.
Could you share the stack trace? I would like to see which table migration method is called.
JVM stacktrace: org.apache.spark.sql.AnalysisException at com.databricks.managedcatalog.ErrorDetailsHandler.wrapServiceException(ErrorDetailsHandler.scala:62) at com.databricks.managedcatalog.ErrorDetailsHandler.wrapServiceException$(ErrorDetailsHandler.scala:35) at com.databricks.managedcatalog.ManagedCatalogClientImpl.wrapServiceException(ManagedCatalogClientImpl.scala:171) at com.databricks.managedcatalog.ManagedCatalogClientImpl.recordAndWrapException(ManagedCatalogClientImpl.scala:5648) at com.databricks.managedcatalog.ManagedCatalogClientImpl.createTable(ManagedCatalogClientImpl.scala:1278) at com.databricks.sql.managedcatalog.PermissionEnforcingManagedCatalog.createManagedOrExternalTable(PermissionEnforcingManagedCatalog.scala:121) at com.databricks.sql.managedcatalog.PermissionEnforcingManagedCatalog.createTable(PermissionEnforcingManagedCatalog.scala:404) at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.$anonfun$createTable$1(ProfiledManagedCatalog.scala:181) at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23) at org.apache.spark.sql.catalyst.MetricKeyUtils$.measure(MetricKey.scala:1056) at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.$anonfun$profile$1(ProfiledManagedCatalog.scala:62) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94) at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.profile(ProfiledManagedCatalog.scala:61) at com.databricks.sql.managedcatalog.ProfiledManagedCatalog.createTable(ProfiledManagedCatalog.scala:181) at com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.createTableInternal(ManagedCatalogSessionCatalog.scala:924) at com.databricks.sql.managedcatalog.ManagedCatalogSessionCatalog.createTable(ManagedCatalogSessionCatalog.scala:851) at com.databricks.sql.DatabricksSessionCatalog.createTable(DatabricksSessionCatalog.scala:225) at com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.updateCatalog(CreateDeltaTableCommand.scala:912) at com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.runPostCommitUpdates(CreateDeltaTableCommand.scala:286) at com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.handleCommit(CreateDeltaTableCommand.scala:266) at com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.$anonfun$run$2(CreateDeltaTableCommand.scala:171) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.withOperationTypeTag(DeltaLogging.scala:225) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.withOperationTypeTag$(DeltaLogging.scala:212) at com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.withOperationTypeTag(CreateDeltaTableCommand.scala:71) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.$anonfun$recordDeltaOperationInternal$2(DeltaLogging.scala:164) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile(DeltaLogging.scala:294) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordFrameProfile$(DeltaLogging.scala:292) at com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.recordFrameProfile(CreateDeltaTableCommand.scala:71) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.$anonfun$recordDeltaOperationInternal$1(DeltaLogging.scala:163) at com.databricks.logging.UsageLogging.$anonfun$recordOperation$1(UsageLogging.scala:525) at 
com.databricks.logging.UsageLogging.executeThunkAndCaptureResultTags$1(UsageLogging.scala:629) at com.databricks.logging.UsageLogging.$anonfun$recordOperationWithResultTags$4(UsageLogging.scala:647) at com.databricks.logging.AttributionContextTracing.$anonfun$withAttributionContext$1(AttributionContextTracing.scala:48) at com.databricks.logging.AttributionContext$.$anonfun$withValue$1(AttributionContext.scala:244) at scala.util.DynamicVariable.withValue(DynamicVariable.scala:62) at com.databricks.logging.AttributionContext$.withValue(AttributionContext.scala:240) at com.databricks.logging.AttributionContextTracing.withAttributionContext(AttributionContextTracing.scala:46) at com.databricks.logging.AttributionContextTracing.withAttributionContext$(AttributionContextTracing.scala:43) at com.databricks.spark.util.PublicDBLogging.withAttributionContext(DatabricksSparkUsageLogger.scala:27) at com.databricks.logging.AttributionContextTracing.withAttributionTags(AttributionContextTracing.scala:95) at com.databricks.logging.AttributionContextTracing.withAttributionTags$(AttributionContextTracing.scala:76) at com.databricks.spark.util.PublicDBLogging.withAttributionTags(DatabricksSparkUsageLogger.scala:27) at com.databricks.logging.UsageLogging.recordOperationWithResultTags(UsageLogging.scala:624) at com.databricks.logging.UsageLogging.recordOperationWithResultTags$(UsageLogging.scala:534) at com.databricks.spark.util.PublicDBLogging.recordOperationWithResultTags(DatabricksSparkUsageLogger.scala:27) at com.databricks.logging.UsageLogging.recordOperation(UsageLogging.scala:526) at com.databricks.logging.UsageLogging.recordOperation$(UsageLogging.scala:494) at com.databricks.spark.util.PublicDBLogging.recordOperation(DatabricksSparkUsageLogger.scala:27) at com.databricks.spark.util.PublicDBLogging.recordOperation0(DatabricksSparkUsageLogger.scala:68) at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:150) at com.databricks.spark.util.UsageLogger.recordOperation(UsageLogger.scala:68) at com.databricks.spark.util.UsageLogger.recordOperation$(UsageLogger.scala:55) at com.databricks.spark.util.DatabricksSparkUsageLogger.recordOperation(DatabricksSparkUsageLogger.scala:109) at com.databricks.spark.util.UsageLogging.recordOperation(UsageLogger.scala:429) at com.databricks.spark.util.UsageLogging.recordOperation$(UsageLogger.scala:408) at com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.recordOperation(CreateDeltaTableCommand.scala:71) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperationInternal(DeltaLogging.scala:162) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperation(DeltaLogging.scala:152) at com.databricks.sql.transaction.tahoe.metering.DeltaLogging.recordDeltaOperation$(DeltaLogging.scala:142) at com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.recordDeltaOperation(CreateDeltaTableCommand.scala:71) at com.databricks.sql.transaction.tahoe.commands.CreateDeltaTableCommand.run(CreateDeltaTableCommand.scala:149) at org.apache.spark.sql.execution.command.ExecutedCommandExec.$anonfun$sideEffectResult$2(commands.scala:84) at org.apache.spark.sql.execution.SparkPlan.runCommandWithAetherOff(SparkPlan.scala:180) at org.apache.spark.sql.execution.SparkPlan.runCommandInAetherOrSpark(SparkPlan.scala:191) at org.apache.spark.sql.execution.command.ExecutedCommandExec.$anonfun$sideEffectResult$1(commands.scala:84) at 
com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:81) at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:80) at org.apache.spark.sql.execution.command.ExecutedCommandExec.executeCollect(commands.scala:94) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$5(QueryExecution.scala:382) at com.databricks.util.LexicalThreadLocal$Handle.runWith(LexicalThreadLocal.scala:63) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$4(QueryExecution.scala:382) at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withTracker(QueryPlanningTracker.scala:186) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$3(QueryExecution.scala:382) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$9(SQLExecution.scala:400) at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:719) at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId0$1(SQLExecution.scala:278) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1179) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId0(SQLExecution.scala:165) at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:656) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$2(QueryExecution.scala:378) at org.apache.spark.sql.execution.QueryExecution$.withInternalError(QueryExecution.scala:1176) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.$anonfun$applyOrElse$1(QueryExecution.scala:374) at org.apache.spark.sql.execution.QueryExecution.org$apache$spark$sql$execution$QueryExecution$$withMVTagsIfNecessary(QueryExecution.scala:325) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:371) at org.apache.spark.sql.execution.QueryExecution$$anonfun$$nestedInanonfun$eagerlyExecuteCommands$1$1.applyOrElse(QueryExecution.scala:347) at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:505) at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:83) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:505) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:40) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:379) at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:375) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:40) at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:40) at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:481) at org.apache.spark.sql.execution.QueryExecution.$anonfun$eagerlyExecuteCommands$1(QueryExecution.scala:347) at 
org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper$.allowInvokingTransformsInAnalyzer(AnalysisHelper.scala:436) at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:347) at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:284) at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:281) at org.apache.spark.sql.Dataset.<init>(Dataset.scala:339) at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:131) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1179) at org.apache.spark.sql.SparkSession.$anonfun$withActiveAndFrameProfiler$1(SparkSession.scala:1186) at com.databricks.spark.util.FrameProfiler$.record(FrameProfiler.scala:94) at org.apache.spark.sql.SparkSession.withActiveAndFrameProfiler(SparkSession.scala:1186) at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:122) at org.apache.spark.sql.SparkSession.$anonfun$sql$6(SparkSession.scala:957) at org.apache.spark.sql.catalyst.QueryPlanningTracker$.withMainTracker(QueryPlanningTracker.scala:179) at org.apache.spark.sql.SparkSession.$anonfun$sql$5(SparkSession.scala:945) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1179) at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:945) at org.apache.spark.sql.connect.planner.SparkConnectPlanner.executeSQL(SparkConnectPlanner.scala:2902) at org.apache.spark.sql.connect.planner.SparkConnectPlanner.handleSqlCommand(SparkConnectPlanner.scala:2744) at org.apache.spark.sql.connect.planner.SparkConnectPlanner.process(SparkConnectPlanner.scala:2685) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.handleCommand(ExecuteThreadRunner.scala:319) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1(ExecuteThreadRunner.scala:242) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.$anonfun$executeInternal$1$adapted(ExecuteThreadRunner.scala:178) at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$2(SessionHolder.scala:333) at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:1179) at org.apache.spark.sql.connect.service.SessionHolder.$anonfun$withSession$1(SessionHolder.scala:333) at org.apache.spark.JobArtifactSet$.withActiveJobArtifactState(JobArtifactSet.scala:97) at org.apache.spark.sql.artifact.ArtifactManager.$anonfun$withResources$1(ArtifactManager.scala:85) at org.apache.spark.util.Utils$.withContextClassLoader(Utils.scala:235) at org.apache.spark.sql.artifact.ArtifactManager.withResources(ArtifactManager.scala:84) at org.apache.spark.sql.connect.service.SessionHolder.withSession(SessionHolder.scala:332) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.executeInternal(ExecuteThreadRunner.scala:178) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner.org$apache$spark$sql$connect$execution$ExecuteThreadRunner$$execute(ExecuteThreadRunner.scala:128) at org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.$anonfun$run$2(ExecuteThreadRunner.scala:526) at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:45) at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:104) at com.databricks.unity.HandleImpl.$anonfun$runWithAndClose$1(UCSHandle.scala:109) at scala.util.Using$.resource(Using.scala:269) at com.databricks.unity.HandleImpl.runWithAndClose(UCSHandle.scala:108) at 
org.apache.spark.sql.connect.execution.ExecuteThreadRunner$ExecutionThread.run(ExecuteThreadRunner.scala:525)
Are these Delta or non-Delta tables?
Can you show us the table info from the assessment dashboard?
All are DBFS root Delta tables.
Is there also a Python stack trace, to see which Python code triggered this?
15:20:12 ERROR [p.s.c.client.logging][migrate_tables_7] GRPC Error received: Traceback (most recent call last):
File "/databricks/spark/python/pyspark/sql/connect/client/core.py", line 1736, in _execute_and_fetch_as_iterator
for b in generator:
File "<frozen _collections_abc>", line 330, in __next__
File "/databricks/spark/python/pyspark/sql/connect/client/reattach.py", line 135, in send
if not self._has_next():
^^^^^^^^^^^^^^^^
File "/databricks/spark/python/pyspark/sql/connect/client/reattach.py", line 196, in _has_next
raise e
File "/databricks/spark/python/pyspark/sql/connect/client/reattach.py", line 168, in _has_next
self._current = self._call_iter(
^^^^^^^^^^^^^^^^
File "/databricks/spark/python/pyspark/sql/connect/client/reattach.py", line 291, in _call_iter
raise e
File "/databricks/spark/python/pyspark/sql/connect/client/reattach.py", line 271, in _call_iter
return iter_fun()
^^^^^^^^^^
File "/databricks/spark/python/pyspark/sql/connect/client/reattach.py", line 169, in <lambda>
lambda: next(self._iterator) # type: ignore[arg-type]
^^^^^^^^^^^^^^^^^^^^
File "/databricks/python/lib/python3.11/site-packages/grpc/_channel.py", line 540, in __next__
return self._next()
^^^^^^^^^^^^
File "/databricks/python/lib/python3.11/site-packages/grpc/_channel.py", line 966, in _next
raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
status = StatusCode.INTERNAL
details = "INVALID_PARAMETER_VALUE: Invalid input: RPC CreateTable Field managedcatalog.ColumnInfo.name: At columns.0: name "" is not a valid name"
debug_error_string = "UNKNOWN:Error received from peer unix:/databricks/sparkconnect/grpc.sock {grpc_message:"INVALID_PARAMETER_VALUE: Invalid input: RPC CreateTable Field managedcatalog.ColumnInfo.name: At columns.0: name \"\" is not a valid name", grpc_status:13, created_time:"2024-09-24T15:20:12.518225701+00:00"}"
>
15:20:12 WARN [d.l.u.hive_metastore.table_migrate][migrate_tables_7] failed-to-migrate: Failed to migrate table [REDACTED]: INVALID_PARAMETER_VALUE: Invalid input: RPC CreateTable Field managedcatalog.ColumnInfo.name: At columns.0: name "" is not a valid name
Is there an existing issue for this?
Current Behavior
The migrate-tables workflow is unable to migrate tables that have a decimal as a column name, e.g. 10.0.
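For illustration only (the issue lists no reproduction steps), a table of the kind being described might be created like this; the schema and table names are invented, and `spark` is the usual Databricks session:

```python
# Hypothetical example only: a Hive-metastore (DBFS-root) Delta table whose
# column name is the decimal-looking string "10.0". Schema and table names
# are invented; exact behaviour may depend on runtime column-name rules.
spark.sql("CREATE SCHEMA IF NOT EXISTS hive_metastore.example_schema")
spark.sql(
    "CREATE TABLE hive_metastore.example_schema.decimal_col_table "
    "(`10.0` INT, id BIGINT) USING DELTA"
)
```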
Expected Behavior
Tables with non-standard column names are migrated, with those column names wrapped in quotes.
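In other words, the expectation is roughly the following kind of escaping when the DDL is generated. This is a minimal sketch of the idea, not the actual UCX code; the column list and target table name are invented.

```python
# Minimal sketch of the expected behaviour, not the actual UCX code:
# backtick-quote every column name when building the CREATE TABLE statement,
# so that names like 10.0 survive the round trip.
def escape_identifier(name: str) -> str:
    """Backtick-quote an identifier, doubling any embedded backticks."""
    return "`" + name.replace("`", "``") + "`"

columns = [("10.0", "INT"), ("id", "BIGINT")]
columns_sql = ", ".join(f"{escape_identifier(n)} {t}" for n, t in columns)
ddl = f"CREATE TABLE my_catalog.my_schema.my_table ({columns_sql})"
print(ddl)
# CREATE TABLE my_catalog.my_schema.my_table (`10.0` INT, `id` BIGINT)
```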
Steps To Reproduce
No response
Cloud
Azure
Operating System
Windows
Version
latest via Databricks CLI
Relevant log output