What is the table format -- is it a Delta Lake table, a raw Parquet table, or something else? A stacktrace of the error would help.
Assuming this is with a table that's ultimately comprised of Parquet files, does this happen even with the config spark.rapids.sql.format.parquet.reader.type=PERFILE? If it works with the PERFILE reader, then that tells us the issue is with setting up the proper context for the multithreaded readers.
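A minimal sketch of that check in a Databricks notebook, assuming a hypothetical Unity Catalog table name, could look like:

```scala
// Sketch: force the per-file Parquet reader to rule out the multithreaded readers.
// The table name below is a placeholder, not taken from the original report.
spark.conf.set("spark.rapids.sql.format.parquet.reader.type", "PERFILE")
spark.sql("SELECT COUNT(*) FROM main.default.sample_managed_table").show()
```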
Yes, it's a Delta Lake table. It didn't work with spark.rapids.sql.format.parquet.reader.type=PERFILE. However, it works when I explicitly configure fs.azure.account.key in the Spark properties.
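For reference, a sketch of that interim workaround as it might be set from a Databricks notebook (the storage account name matches the one in the stack trace; the secret scope and key names are placeholders):

```scala
// Sketch of the interim workaround: set the ABFS account key in the Spark conf
// so executors can initialize the filesystem for this storage account.
// The secret scope and key names are placeholders.
spark.conf.set(
  "fs.azure.account.key.databricksmetaeast.dfs.core.windows.net",
  dbutils.secrets.get(scope = "my-scope", key = "storage-account-key"))
```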
Stack Trace:
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 22.0 failed 4 times, most recent failure: Lost task 0.3 in stage 22.0 (TID 28) (10.9.4.10 executor 0): Failure to initialize configuration for storage account databricksmetaeast.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:52)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AbfsConfiguration.getStorageAccountKey(AbfsConfiguration.java:670)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.initializeClient(AzureBlobFileSystemStore.java:2055)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystemStore.<init>(AzureBlobFileSystemStore.java:267)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.AzureBlobFileSystem.initialize(AzureBlobFileSystem.java:225)
at com.databricks.common.filesystem.LokiFileSystem$.$anonfun$getLokiFS$1(LokiFileSystem.scala:64)
at com.databricks.common.filesystem.Cache.getOrCompute(Cache.scala:38)
at com.databricks.common.filesystem.LokiFileSystem$.getLokiFS(LokiFileSystem.scala:61)
at com.databricks.common.filesystem.LokiFileSystem.initialize(LokiFileSystem.scala:87)
at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:3469)
at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:537)
at org.apache.hadoop.fs.Path.getFileSystem(Path.java:365)
at org.apache.parquet.hadoop.util.HadoopInputFile.fromPath(HadoopInputFile.java:43)
at org.apache.parquet.hadoop.ParquetFileReader.readFooter(ParquetFileReader.java:482)
at com.nvidia.spark.rapids.GpuParquetFileFilterHandler.$anonfun$readAndSimpleFilterFooter$11(GpuParquetScan.scala:676)
at scala.Option.getOrElse(Option.scala:189)
at com.nvidia.spark.rapids.GpuParquetFileFilterHandler.$anonfun$readAndSimpleFilterFooter$6(GpuParquetScan.scala:675)
at scala.Option.getOrElse(Option.scala:189)
at com.nvidia.spark.rapids.GpuParquetFileFilterHandler.$anonfun$readAndSimpleFilterFooter$1(GpuParquetScan.scala:652)
at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:29)
at com.nvidia.spark.rapids.GpuParquetFileFilterHandler.readAndSimpleFilterFooter(GpuParquetScan.scala:643)
at com.nvidia.spark.rapids.GpuParquetFileFilterHandler.$anonfun$filterBlocks$1(GpuParquetScan.scala:728)
at com.nvidia.spark.rapids.Arm$.withResource(Arm.scala:29)
at com.nvidia.spark.rapids.GpuParquetFileFilterHandler.filterBlocks(GpuParquetScan.scala:689)
at com.nvidia.spark.rapids.GpuParquetPartitionReaderFactory.buildBaseColumnarParquetReader(GpuParquetScan.scala:1338)
at com.nvidia.spark.rapids.GpuParquetPartitionReaderFactory.buildColumnarReader(GpuParquetScan.scala:1328)
at com.nvidia.spark.rapids.PartitionReaderIterator$.$anonfun$buildReader$1(PartitionReaderIterator.scala:66)
at org.apache.spark.sql.rapids.shims.GpuFileScanRDD$$anon$1.org$apache$spark$sql$rapids$shims$GpuFileScanRDD$$anon$$readCurrentFile(GpuFileScanRDD.scala:97)
at org.apache.spark.sql.rapids.shims.GpuFileScanRDD$$anon$1.nextIterator(GpuFileScanRDD.scala:151)
at org.apache.spark.sql.rapids.shims.GpuFileScanRDD$$anon$1.hasNext(GpuFileScanRDD.scala:74)
at org.apache.spark.sql.rapids.GpuFileSourceScanExec$$anon$1.hasNext(GpuFileSourceScanExec.scala:474)
at com.nvidia.spark.rapids.DynamicGpuPartialSortAggregateIterator.$anonfun$hasNext$4(GpuAggregateExec.scala:1930)
at scala.runtime.java8.JFunction0$mcZ$sp.apply(JFunction0$mcZ$sp.java:23)
at scala.Option.getOrElse(Option.scala:189)
at com.nvidia.spark.rapids.DynamicGpuPartialSortAggregateIterator.hasNext(GpuAggregateExec.scala:1930)
at org.apache.spark.sql.rapids.execution.GpuShuffleExchangeExecBase$$anon$1.partNextBatch(GpuShuffleExchangeExecBase.scala:332)
at org.apache.spark.sql.rapids.execution.GpuShuffleExchangeExecBase$$anon$1.hasNext(GpuShuffleExchangeExecBase.scala:355)
at org.apache.spark.shuffle.sort.BypassMergeSortShuffleWriter.write(BypassMergeSortShuffleWriter.java:151)
at org.apache.spark.shuffle.ShuffleWriteProcessor.write(ShuffleWriteProcessor.scala:59)
at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$3(ShuffleMapTask.scala:81)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.scheduler.ShuffleMapTask.$anonfun$runTask$1(ShuffleMapTask.scala:81)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:53)
at org.apache.spark.scheduler.ShuffleMapTask.runTask(ShuffleMapTask.scala:41)
at org.apache.spark.scheduler.Task.doRunTask(Task.scala:179)
at org.apache.spark.scheduler.Task.$anonfun$run$5(Task.scala:142)
at com.databricks.unity.UCSEphemeralState$Handle.runWith(UCSEphemeralState.scala:41)
at com.databricks.unity.HandleImpl.runWith(UCSHandle.scala:99)
at com.databricks.unity.HandleImpl.$anonfun$runWithAndClose$1(UCSHandle.scala:104)
at scala.util.Using$.resource(Using.scala:269)
at com.databricks.unity.HandleImpl.runWithAndClose(UCSHandle.scala:103)
at org.apache.spark.scheduler.Task.$anonfun$run$1(Task.scala:142)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.scheduler.Task.run(Task.scala:97)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$13(Executor.scala:904)
at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1740)
at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$4(Executor.scala:907)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at com.databricks.spark.util.ExecutorFrameProfiler$.record(ExecutorFrameProfiler.scala:110)
at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:761)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
Caused by: Invalid configuration value detected for fs.azure.account.key
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.diagnostics.ConfigurationBasicValidator.validate(ConfigurationBasicValidator.java:49)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.diagnostics.Base64StringConfigurationBasicValidator.validate(Base64StringConfigurationBasicValidator.java:40)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.validateStorageAccountKey(SimpleKeyProvider.java:71)
at shaded.databricks.azurebfs.org.apache.hadoop.fs.azurebfs.services.SimpleKeyProvider.getStorageAccountKey(SimpleKeyProvider.java:49)
... 63 more
@mattahrens @sameerz Is this related to #8242?
Yes, it is related. We do not need to test with Alluxio, but with the file cache.
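A sketch of the cluster-level Spark config for reproducing with the file cache (the config key is assumed from the plugin's file cache feature):

```
# Sketch: cluster Spark config to test with the RAPIDS Accelerator file cache
# instead of Alluxio. Config key assumed from the plugin's file cache feature.
spark.rapids.filecache.enabled true
```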
Did you resolve this issue?
Yes, this issue is fixed in release 24.06.02.
If you are seeing this issue with that version of the Spark RAPIDS plugin or newer, please file an issue with the details and we can take a look.
Describe the bug
While using Azure Databricks and attempting to read a Managed Table from the Unity Catalog Metastore with the RAPIDS Accelerator, I encountered an invalid credentials issue with the following message:
Failure to initialize configuration for storage account databricksmetaeast.dfs.core.windows.net: Invalid configuration value detected for fs.azure.account.keyInvalid configuration value detected for fs.azure.account.key.
However, this error doesn't occur when RAPIDS is disabled.

Notes
Adding the credentials of the storage container to the Spark configuration properties can serve as an interim solution. However, this approach is not scalable when there are multiple storage containers.

Environment details
Managed Tables on Azure Databricks with Unity Catalog and RAPIDS Accelerator