ash211 opened 7 years ago
I think if environment is duplicated then Kubernetes would throw an error.
@ash211 For the minor docs change, could you clarify what you'd like? For the potential conflict, I implemented the conflict check at first but removed it for the two reasons below:
Maybe documenting a hint for users about this is a better idea?
@mccheah I did a test with duplicated custom envs: it doesn't fail, and the environment variable ends up unique in the container. One interesting related issue: when any spark-submit conf option contains space characters, the process fails (not only when setting envs). This has been filed as https://github.com/apache-spark-on-k8s/spark/issues/440
I did a test with duplicated custom envs: it doesn't fail, and the environment variable ends up unique in the container.
Which environment variable does it select, then? We can probably add a validation after the pod has been fully constructed. The validation shouldn't be a configuration step.
@mccheah It uses the last value, both when setting a custom variable and when overriding an internal Spark env. Are you suggesting a validation for conflicts with internal environment keys?
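To illustrate the "last value wins" behavior described above, here is a minimal sketch (in Python purely for illustration; the actual code is Scala, and the function name `merge_env` is hypothetical) of merging an ordered list of env entries where later duplicates silently overwrite earlier ones:

```python
def merge_env(entries):
    """Merge (name, value) env entries; later duplicates silently win.

    This mirrors the observed behavior: the container ends up with a
    unique value per variable, taken from the last assignment.
    """
    merged = {}
    for name, value in entries:
        merged[name] = value  # a later duplicate overwrites the earlier one
    return merged


entries = [
    ("SPARK_LOCAL_DIRS", "/tmp/spark"),
    ("MY_VAR", "first"),
    ("MY_VAR", "second"),  # duplicate: this value wins
]
print(merge_env(entries))  # MY_VAR resolves to "second"
```

The point of the thread is that this silent overwrite is exactly what makes misconfiguration hard to spot, motivating the validation discussed below.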
We should probably validate this and throw an error. For now it can just check the list of environment variables and throw an error for any that are duplicates, with a message like: "[Driver/Executor] environment variable X was given multiple values: [values]. If you did not set this multiple times, they might have been set by spark-submit."
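A minimal sketch of the suggested post-construction check, assuming the validation runs over the final list of (name, value) pairs on the pod spec. This is written in Python for illustration only (the real implementation would be Scala), and the function name `check_duplicate_env` is hypothetical:

```python
from collections import Counter


def check_duplicate_env(component, entries):
    """Raise an error if any env var name appears more than once.

    component: e.g. "Driver" or "Executor".
    entries:   list of (name, value) pairs as they would appear on the
               finally-constructed pod.
    """
    counts = Counter(name for name, _ in entries)
    for name, count in counts.items():
        if count > 1:
            values = [v for n, v in entries if n == name]
            raise ValueError(
                f"{component} environment variable {name} was given multiple "
                f"values: {values}. If you did not set this multiple times, "
                f"they might have been set by spark-submit."
            )
```

Running this after pod construction (rather than as a configuration step) catches duplicates regardless of whether they came from the user or from spark-submit internals.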
I agree w/ @mccheah: multiple settings may indicate a configuration bug, and silently overwriting could make it hard to find.
@mccheah @erikerlandson Sorry for the late reply. I agree that at least a message should be emitted on overwriting. I'll add a check later.
Sorry I was slow to review and had post-merge comments for https://github.com/apache-spark-on-k8s/spark/pull/395#pullrequestreview-57641643 which was moved to https://github.com/apache-spark-on-k8s/spark/pull/424 and merged there (on branch-2.2-kubernetes instead of 2.1)
The points were:
Thoughts?
cc @tangzhankun