linkedin / linkedin-gradle-plugin-for-apache-hadoop

Apache License 2.0

runSparkJob will fail if running python Spark job. #80

convexshiba closed 8 years ago

convexshiba commented 8 years ago

  1. In the source code, SparkPlugin applies the same check as the old required fields. If the Hadoop DSL checker already checks this, is it necessary for SparkPlugin to check it again?
  2. Also, the method that builds the Spark command will not produce the right command for a Python job. The --class option should be omitted if appClass is null.

convexquad commented 8 years ago

Ah - the Hadoop DSL static checker is only applied if you are compiling to Azkaban. If you are running from the gateway, the static checker is not applied (and should not be); the task should merely validate whatever fields are necessary to launch the job on the gateway. We definitely need to make the same set of patches to the task to accommodate PySpark jobs.
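
For illustration only, here is a minimal Python sketch of the behavior being discussed; it is not the plugin's actual code. The names `validate_spark_job`, `build_spark_command`, `execution_target`, and `app_params` are hypothetical, and treating a target ending in `.py` as a PySpark job is an assumption. The sketch validates only the fields needed to launch and omits `--class` when no application class is set.

```python
from typing import List, Optional, Sequence


def validate_spark_job(execution_target: Optional[str],
                       app_class: Optional[str]) -> None:
    """Check only the fields needed to launch the job (hypothetical rule set)."""
    if not execution_target:
        raise ValueError("an execution target (jar or .py file) is required")
    # Assumption: a target ending in .py is a PySpark job and needs no class.
    is_python = execution_target.endswith(".py")
    if not is_python and not app_class:
        raise ValueError("appClass is required for non-Python Spark jobs")


def build_spark_command(execution_target: str,
                        app_class: Optional[str] = None,
                        app_params: Sequence[str] = ()) -> List[str]:
    """Build a spark-submit argument list, skipping --class when it is unset."""
    validate_spark_job(execution_target, app_class)
    command = ["spark-submit"]
    if app_class:
        # Only emit --class when an application class was actually given.
        command += ["--class", app_class]
    command.append(execution_target)
    command.extend(app_params)
    return command
```

With this shape, `build_spark_command("job.py")` yields a command with no `--class` option, while a jar target without an application class is rejected before anything is launched.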

convexshiba commented 8 years ago

Ah I see. I'll make the corresponding changes and send a PR. I'll ping you if I can't make it or run into any problems. Thanks Alex!

convexquad commented 8 years ago

This was closed by #83