apache / incubator-gluten

Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
https://gluten.apache.org/
Apache License 2.0

how to add libhdfs conf when running spark-sql #6728

Open lihao712 opened 2 months ago

lihao712 commented 2 months ago

Backend

VL (Velox)

Bug description

When I try to run Spark SQL with Gluten and HDFS support, I add spark.executorEnv.LIBHDFS3_CONF="/path/to/hdfs-client.xml" to spark-defaults.conf, but the executors cannot read this path when the query runs. The --files approach can't be used with a spark-sql query either. So what should I do? Should I copy hdfs-client.xml to the same path on every cluster node? That is too expensive and impractical. Is there some way to solve this problem?

Spark version

None

Spark configurations

No response

System information

No response

Relevant logs

No response

ArnavBalyan commented 2 months ago

Hi @lihao712, you can place the file on HDFS and provide its HDFS path, which should work: "spark.executorEnv.LIBHDFS3_CONF": "<HDFS Path to XML>". cc @FelixYBW for confirmation. Thanks!
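To make the suggestion concrete, here is a minimal sketch of passing that setting on the spark-sql command line instead of editing spark-defaults.conf. It only assembles the argument list; the HDFS URI, the `build_spark_sql_command` helper, and the sample query are hypothetical placeholders, and whether the executors accept an HDFS path here still depends on the confirmation requested above.

```python
import shlex

# Hypothetical HDFS location of the client config; replace with your own path.
HDFS_CLIENT_XML = "hdfs://namenode:8020/conf/hdfs-client.xml"


def build_spark_sql_command(query, libhdfs3_conf=HDFS_CLIENT_XML):
    """Return a spark-sql argument list that sets LIBHDFS3_CONF for executors.

    spark.executorEnv.<NAME> is the standard Spark way to export an
    environment variable into each executor process.
    """
    return [
        "spark-sql",
        "--conf", f"spark.executorEnv.LIBHDFS3_CONF={libhdfs3_conf}",
        "-e", query,
    ]


if __name__ == "__main__":
    # Print a shell-safe command line that can be copied into a terminal.
    print(shlex.join(build_spark_sql_command("SELECT 1")))
```

The same key/value pair can equally go into spark-defaults.conf; the command-line form is just easier to try per query.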

PHILO-HE commented 2 months ago

@ArnavBalyan is right.

@lihao712, you can refer to this document for more details: https://github.com/apache/incubator-gluten/blob/main/docs/get-started/Velox.md#configuration-about-hdfs-support