Closed by rvesse 6 years ago
Ah, yes, Kerberos support was not added for the init-container. @ifilonenko
The secret with the appropriate Kerberos credentials is mounted after the init-container is launched. As such, you would need to pre-populate the secret with your job users’ delegation token for the init-container to see it. In our upstreaming process we are removing the init-container and launching spark-submit from the driver. As such, the init-container will soon be deprecated.
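As a rough illustration of what pre-populating the secret could look like, the sketch below builds a Kubernetes Secret manifest that carries a delegation token as base64-encoded data. The secret name `hdfs-delegation-token`, the data key `hadoop-token-file`, and the helper `build_token_secret` are hypothetical placeholders, not names confirmed in this thread; the key the branch actually expects would need to be checked against its code.

```python
import base64
import json


def build_token_secret(name: str, key: str, token_bytes: bytes) -> dict:
    """Return a Kubernetes Secret manifest holding one binary item."""
    return {
        "apiVersion": "v1",
        "kind": "Secret",
        "metadata": {"name": name},
        "type": "Opaque",
        # Values under `data` must be base64-encoded strings.
        "data": {key: base64.b64encode(token_bytes).decode("ascii")},
    }


if __name__ == "__main__":
    # Placeholder bytes; a real token would be read from a file
    # obtained for the job user (hypothetical names throughout).
    manifest = build_token_secret(
        "hdfs-delegation-token", "hadoop-token-file", b"dummy-token"
    )
    print(json.dumps(manifest, indent=2))
```

The printed JSON could then be applied with kubectl before submitting the job, assuming the submission is configured to pick up a secret of that name.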
@ifilonenko In which branch/repo is that? We need to get Kerberos support usable for our customers ASAP, so we are happy to use a cutting-edge branch if necessary.
Kerberos should be enabled on branch-2.2-kubernetes as I have tested this myself. It just doesn’t support interaction from the init-container as that wasn’t a use case that we thought was necessary at that point in time.
@ifilonenko I think we should mount the same secret that stores the delegation token into the init-container. This is the case for general secrets: we always mount each user-specified secret into both the init-container and main container.
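To make the suggestion concrete, here is a minimal sketch of mounting one secret into both the init-container and the main container. It works on a pod spec expressed as a plain Python dict, not on the fork's actual submission code; the function name `mount_secret_everywhere`, the volume naming scheme, and the mount path are assumptions for illustration only.

```python
import json


def mount_secret_everywhere(pod: dict, secret_name: str, mount_path: str) -> dict:
    """Add a secret volume and mount it into every init container and container."""
    spec = pod.setdefault("spec", {})
    volume_name = f"{secret_name}-volume"  # hypothetical naming scheme
    spec.setdefault("volumes", []).append(
        {"name": volume_name, "secret": {"secretName": secret_name}}
    )
    mount = {"name": volume_name, "mountPath": mount_path, "readOnly": True}
    # Apply the same mount to both container groups so the init-container
    # sees exactly what the main container sees.
    for group in ("initContainers", "containers"):
        for container in spec.get(group, []):
            container.setdefault("volumeMounts", []).append(dict(mount))
    return pod


if __name__ == "__main__":
    pod = {
        "spec": {
            "initContainers": [{"name": "spark-init"}],
            "containers": [{"name": "spark-driver"}],
        }
    }
    print(json.dumps(mount_secret_everywhere(pod, "hdfs-token", "/mnt/secrets/hdfs-token"), indent=2))
```

Using a single mount definition for both groups keeps the path identical in each container, so any code that reads the token does not need to know which container it is running in.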
Working on a fix for this internally, will post a PR once I have validated the fix
thanks @rvesse :) I will review it.
When trying to run a job that requires the use of the --files flag to pre-load files into the container, it seems that the init container does not include the Kerberos login logic, which results in failure to download the dependencies, thus failing the entire job.
Looking at the PR that added Secure HDFS support (#540) I don't see any sign that the init container logic was modified, so it appears that this was not included.
Submission Line
test2.py is just a toy Spark job; the contents are irrelevant here because the job fails before they are ever consumed, but I would note that the same job runs fine on an unsecured HDFS cluster.
Resulting Logs
The job eventually fails. kubectl describe pods shows that the init container failed, and the following are the logs from that container:
So it looks like the init container isn't recognising that it should be using Kerberos login for HDFS.