Closed. chenchun closed this 6 years ago.
Thank you for this, as I am developing the integration tests right now to catch these errors.
rerun unit tests please
This change would need to be added to all the dockerfiles as this logic is shared amongst the -py and -r versions
LGTM. Thanks for fixing!
I have been successfully using 0612195f9027a4641b43c9d444fe8336cfeaa8c0 for a while to connect to S3, passing the credentials with spark.hadoop.fs.s3a... properties.
I have updated to the latest commit (so this commit plus the big one related to Kerberos HDFS), and the authentication to S3 does not work anymore.
I don't know where this issue comes from, but I have rolled back and now everything is fine again. Could it be that the executor reads the conf from the given HADOOP_CONF_DIR and ignores the additional properties?
That could be the case. Maybe modify the XML to include that? The properties are probably being overwritten by the files in HADOOP_CONF_DIR.
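For reference, the pattern being discussed can be sketched like this (the bucket path and environment-variable names below are placeholders, not values from this thread; whether these properties survive an executor-side HADOOP_CONF_DIR is exactly the open question here):

```scala
import org.apache.spark.sql.SparkSession

// Sketch: passing S3A credentials through spark.hadoop.* properties, which
// Spark copies into the Hadoop Configuration it hands to the S3A filesystem.
val spark = SparkSession.builder()
  .appName("s3a-auth-check")
  .config("spark.hadoop.fs.s3a.access.key", sys.env("AWS_ACCESS_KEY_ID"))
  .config("spark.hadoop.fs.s3a.secret.key", sys.env("AWS_SECRET_ACCESS_KEY"))
  .getOrCreate()

// If an executor built its Hadoop conf from HADOOP_CONF_DIR alone, the
// spark.hadoop.* entries above would never reach it and reads would be denied.
val df = spark.read.text("s3a://some-bucket/some/path")
```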
@ifilonenko after hours of debugging a moving target, I have finally found the issue, and it is not the one I mentioned (so no worries, all good with this commit).
For the record, I was instantiating a Hadoop Configuration object before accessing the spark read methods for the first time. This apparently led to a hadoop conf being available that spark was picking up. That hadoop conf, however, was not fed with the needed s3 properties, resulting in access denied (I needed that hadoop conf to do some low-level work on the bare HDFS files).
The strange thing is that with spark-yarn this sequence works without problems, but with spark-k8s it seems to cause issues. I don't think we have to manage this, but maybe it should be documented...
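The ordering pitfall described above can be sketched as follows (names are illustrative, and the exact mechanism by which spark picks up a pre-built conf is my assumption, not something confirmed in this thread):

```scala
import org.apache.hadoop.fs.{FileSystem, Path}
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("hdfs-then-s3").getOrCreate()

// Problematic pattern: building a fresh Hadoop Configuration before the
// first spark read. That conf carries none of the spark.hadoop.fs.s3a.*
// entries, and if it is the one spark ends up using, S3 access is denied.
//   val bareConf = new org.apache.hadoop.conf.Configuration()
//   val fs = FileSystem.get(bareConf)

// Safer pattern: reuse the conf spark already populated from spark.hadoop.*
// properties for any bare-HDFS work.
val hadoopConf = spark.sparkContext.hadoopConfiguration
val fs = FileSystem.get(hadoopConf)
val listing = fs.listStatus(new Path("/"))
```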
Btw, looking at the last hdfs-kerberos commit, I see you introduced an additional prop spark.kubernetes.hadoop.executor.hadoopConfigMapName. Two naming remarks: shouldn't it rather be spark.kubernetes.hadoop.hadoopConfigMapName (omit the executor part)? Also, spark.kubernetes.initcontainer.executor.configmapname is in lower case, while spark.kubernetes.hadoop.executor.hadoopConfigMapName is in camel case.
Until now I was running in client mode, which had an hdfs configuration mounted via a config map (hdfs-k8s-hdfs-k8s-hdfs-dn in my case) and works well. As I also want to support hdfs access in cluster mode, I have tested passing spark.kubernetes.initcontainer.executor.configmapname=hdfs-k8s-hdfs-k8s-hdfs-dn, and I see a submission with SPARK_JAVA_OPT_14: -Dspark.kubernetes.hadoop.executor.hadoopConfigMapName=zeppelin-k8s-spark-1515498884837-hadoop-config, which is not what I was expecting (I was expecting the driver to benefit from my existing hadoop conf). My goal is to reuse an existing hadoop conf via config map in the cluster driver and executors. Am I missing something?
@echarles I raised the same issue: https://github.com/apache-spark-on-k8s/spark/issues/580
I think spark.kubernetes.hadoop.executor.hadoopConfigMapName is currently an internal config, not a user config.
cc @ChenLingPeng. Can you submit your patch which makes it possible for the driver/executor to reuse an existing hadoop conf configmap?
Will do this ASAP
Thx @ChenLingPeng (cc/ @chenchun). From the code and behavior, I understand that the DriverConfigurationStepsOrchestrator will create a new hadoop config map each time a driver is created.
private val kubernetesResourceNamePrefix = s"$appName-$launchTime".toLowerCase.replaceAll("\\.", "-")
private val hadoopConfigMapName = s"$kubernetesResourceNamePrefix-hadoop-config"
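That derivation can be illustrated with the values visible in the SPARK_JAVA_OPT_14 line quoted earlier (the appName and launchTime below are read off that log line, so treat them as an example):

```scala
// Reproduces the per-submission configmap-name derivation from
// DriverConfigurationStepsOrchestrator with example inputs.
val appName = "zeppelin-k8s-spark"
val launchTime = 1515498884837L
val kubernetesResourceNamePrefix =
  s"$appName-$launchTime".toLowerCase.replaceAll("\\.", "-")
val hadoopConfigMapName = s"$kubernetesResourceNamePrefix-hadoop-config"
// hadoopConfigMapName == "zeppelin-k8s-spark-1515498884837-hadoop-config"
```

Because launchTime changes on every submission, a fresh configmap name is generated each time, which is why an existing configmap is not simply reused.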
Actually, I am looking for a way to provide my own hadoopConfigMapName via a spark property. Does your patch implement this?
@ChenLingPeng @chenchun My bad (not sure what I was doing...). Now I can mount an existing hadoop configmap in the driver. A new configmap is created based on the given one with this code in the DriverConfigurationStepsOrchestrator:
hadoopConfDir.map { conf =>
val hadoopStepsOrchestrator =
new HadoopStepsOrchestrator(
kubernetesResourceNamePrefix,
namespace,
hadoopConfigMapName,
submissionSparkConf,
conf)
Once the driver is created with the correct hadoop configmap, the executors benefit from it via the classical spark.sparkContext.hadoopConfiguration.
What changes were proposed in this pull request?
As the title says
How was this patch tested?
manual tests
I'm running a pagerank job which loads datasets from an HA HDFS with the following command.
The driver pod failed with exceptions in the log.
It seems the exception is due to the fact that HADOOP_CONF_DIR is missing from SPARK_CLASSPATH. After adding HADOOP_CONF_DIR to SPARK_CLASSPATH in the driver/executor image, my job can run successfully.
Not sure why this Dockerfile change is missing when comparing https://github.com/apache-spark-on-k8s/spark/pull/540 to https://github.com/apache-spark-on-k8s/spark/pull/414.
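The fix described above could look roughly like this in the driver/executor Dockerfile (a sketch only; the actual classpath wiring in the images and the value of HADOOP_CONF_DIR may differ):

```dockerfile
# Sketch: make the mounted Hadoop configuration visible on the Spark classpath
# so HA HDFS settings (e.g. nameservice resolution) are picked up.
# HADOOP_CONF_DIR is assumed to point at the mounted hadoop config map.
ENV SPARK_CLASSPATH="${SPARK_CLASSPATH}:${HADOOP_CONF_DIR}"
```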