Closed by xkrogen 7 years ago
In addition to the unit tests, what other tests do you plan to run?
@xkrogen AFAIK, the Spark code is moving outside azkaban-plugins. I think this might break if/when that happens?
@HappyRay About other testing, I wanted to ask you about that. Do we have any good way to integration test these? I'm not familiar with the Azkaban develop/test process.
@suvodeep-pyne Where is it moving? Can we replicate this same logic there?
@xkrogen It is moving to an internal repo as far as I am aware. @Victsm can provide some context.
@xkrogen We are moving the code out of this repo to enable hot-fix deployment of the Spark job type plugin. The Azkaban team cannot hot-deploy the plugin while it lives in this repo; we need to build the Spark job type plugin JAR and publish it to Artifactory, which cuts down the delay for deploying changes to the Spark job type.
@Victsm Okay, sounds good. We need this fix to be deployed by next week. I'm assuming the change will not happen by then? In any case, we need this fix in the SparkJob plugin, wherever it lives.
Updated the code to fix an issue where the previous version relied on Spark being on the classpath (which it is not). Also refactored a bit per @HappyRay's comments. I tested this on a live Azkaban cluster and confirmed that both Hadoop-type and Spark-type jobs respect this configuration.
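For context on the classpath point, the idea is simply to look up the Spark property by its literal string key rather than through Spark's own configuration classes. A minimal sketch of that shape (class and method names here are illustrative, not the actual code in this PR):

```java
import java.util.Properties;

// Sketch only: reading the Spark-level namenode list by its string key means the
// job type keeps working even when no Spark classes are on Azkaban's classpath.
public class SparkNamenodeProps {
  public static String sparkAccessNamenodes(Properties jobProps) {
    return jobProps.getProperty("spark.yarn.access.namenodes", "");
  }
}
```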
Modify the Hadoop(Spark,Hive,Pig,Java)Job types to respect the MapReduce- and Spark-level configurations for additional namenodes from which to fetch delegation tokens, in addition to the Azkaban-level `other_namenodes` configuration. MapReduce and Spark both have configurations for specifying namenodes from which additional delegation tokens should be fetched before submitting a job (`mapreduce.job.hdfs-servers` and `spark.yarn.access.namenodes`, respectively), but previously Azkaban ignored these and used only its own `other_namenodes` property. Azkaban should additionally respect these framework-level configurations.
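To make the intended behavior concrete, here is a rough sketch of how a job type could union the namenode URIs from all three properties before fetching delegation tokens. The class and helper names are hypothetical, not the actual Azkaban implementation:

```java
import java.util.LinkedHashSet;
import java.util.Properties;
import java.util.Set;

public class NamenodeTokenConfig {
  // Azkaban-level property plus the framework-level ones this change adds support for.
  private static final String OTHER_NAMENODES = "other_namenodes";
  private static final String MR_HDFS_SERVERS = "mapreduce.job.hdfs-servers";
  private static final String SPARK_ACCESS_NAMENODES = "spark.yarn.access.namenodes";

  /**
   * Collects the union of namenode URIs from the Azkaban, MapReduce, and Spark
   * properties. All three are comma-separated lists; any that are absent are skipped.
   */
  public static Set<String> namenodesToFetchTokensFor(Properties jobProps) {
    Set<String> namenodes = new LinkedHashSet<>();
    for (String key : new String[] {OTHER_NAMENODES, MR_HDFS_SERVERS, SPARK_ACCESS_NAMENODES}) {
      String value = jobProps.getProperty(key);
      if (value == null || value.trim().isEmpty()) {
        continue;
      }
      for (String uri : value.split(",")) {
        if (!uri.trim().isEmpty()) {
          namenodes.add(uri.trim());
        }
      }
    }
    return namenodes;
  }
}
```

A delegation token would then be requested from each namenode in the resulting set, in addition to the job's default namenode.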