sehunley opened this issue 8 years ago
Hello Scott,
Thank you very much for your email and introduction. I am not an expert in Python, but I would like to take a look at your Python program; please share it so that I can review it.
Thank you again! Best regards, Mahmoud
On Feb 15, 2016, at 11:23 AM, sehunley notifications@github.com wrote:
When submitting a Java or Scala program, everything works fine. When submitting a Python program, it gets to the ACCEPTED state and then stalls. It eventually times out, but it is never picked up to run. Is this interface just for Java/Scala programs/jobs, or should it be able to submit PySpark/Python jobs as well?
Thanks,
-Scott
Hello Mahmoud, My Python program is pretty simple; it's a sample program that came with Spark 1.6.0. Our data scientists write their machine-learning scripts in Python, hence the need to call Python programmatically. Here is the program:
```python
from __future__ import print_function

import sys
from random import random
from operator import add

from pyspark import SparkContext

if __name__ == "__main__":
    """
        Usage: pi [partitions]
    """
    sc = SparkContext(appName="PythonPi")
    partitions = int(sys.argv[1]) if len(sys.argv) > 1 else 2
    n = 100000 * partitions

    def f(_):
        x = random() * 2 - 1
        y = random() * 2 - 1
        return 1 if x ** 2 + y ** 2 < 1 else 0

    count = sc.parallelize(range(1, n + 1), partitions).map(f).reduce(add)
    print("Pi is roughly %f" % (4.0 * count / n))

    sc.stop()
```
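(For context: this is the standard pi estimator that ships with Spark. From a shell, an equivalent submission, assuming YARN cluster mode, would be something like `spark-submit --master yarn --deploy-mode cluster pi.py 10`; the exact command used is not shown in this thread.)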
Thanks in advance for any help you can give. Here is a link to the actual output that occurs when I run the code and try to submit the program above: http://stackoverflow.com/questions/35373367/issue-submitting-a-python-application-to-yarn-from-java-code
-Scott
I am trying to invoke the pi.py sample program that comes with Spark 1.6.0.
Below is the Java program that I am testing with. I'm new to Spark, so apologies for any "newbie" errors.
```java
import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;
import org.apache.hadoop.conf.Configuration;
// import org.apache.log4j.Logger;

/**
 * Usage: org.apache.spark.deploy.yarn.Client [options]
 * Options:
 *   --jar JAR_PATH           Path to your application's JAR file (required in yarn-cluster mode)
 *   --class CLASS_NAME       Name of your application's main class (required)
 *   --primary-py-file        A main Python file
 *   --arg ARG                Argument to be passed to your application's main class.
 *                            Multiple invocations are possible, each will be passed in order.
 *   --num-executors NUM      Number of executors to start (Default: 2)
 *   --executor-cores NUM     Number of cores per executor (Default: 1)
 *   --driver-memory MEM      Memory for driver (e.g. 1000M, 2G) (Default: 512 MB)
 *   --driver-cores NUM       Number of cores used by the driver (Default: 1)
 *   --executor-memory MEM    Memory per executor (e.g. 1000M, 2G) (Default: 1G)
 *   --name NAME              The name of your application (Default: Spark)
 *   --queue QUEUE            The hadoop queue to use for allocation requests (Default: 'default')
 *   --addJars JARS           Comma-separated list of local jars that want SparkContext.addJar to work with
 *   --py-files PY_FILES      Comma-separated list of .zip, .egg, or .py files to place on the PYTHONPATH for Python apps
 *   --files FILES            Comma-separated list of files to be distributed with the job
 *   --archives ARCHIVES      Comma-separated list of archives to be distributed with the job
 *
 * How to call this program example:
 *   export SPARK_HOME="/Users/mparsian/spark-1.6.0"
 *   java -DSPARK_HOME="$SPARK_HOME" org.dataalgorithms.client.SubmitSparkPiToYARNFromJavaCode 10
 */
public class SubmitSparkPiToYARNFromJavaCode {

    public static void main(String[] args) throws Exception {
        long startTime = System.currentTimeMillis();
        String slices = args[0];
        String SPARK_HOME = System.getProperty("SPARK_HOME");
        pi(SPARK_HOME, slices);
        System.out.println("elapsed time (millis): " + (System.currentTimeMillis() - startTime));
    }

    static void pi(String SPARK_HOME, String slices) throws Exception {
        // arguments understood by org.apache.spark.deploy.yarn.Client (see usage above)
        String[] args = new String[]{
            "--name", "Submit-SparkPi-To-Yarn",
            "--driver-memory", "512MB",
            "--jar", SPARK_HOME + "/examples/target/spark-examples_2.11-1.6.0.jar",
            "--class", "org.apache.spark.examples.JavaSparkPi",
            "--arg", slices
        };
        // hand the arguments to the YARN client and run the application
        System.setProperty("SPARK_YARN_MODE", "true");
        SparkConf sparkConf = new SparkConf();
        Configuration hadoopConf = new Configuration();
        Client client = new Client(new ClientArguments(args, sparkConf), hadoopConf, sparkConf);
        client.run();
    }
}
```
Thanks,
-Scott
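Since the usage text in the Java program above lists a --primary-py-file option, the same Client interface should in principle accept a Python file in place of a JAR. Below is a minimal, untested sketch of that variant; the class name, application name, and the pi.py path under SPARK_HOME are assumptions for illustration, not code from this thread.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.spark.SparkConf;
import org.apache.spark.deploy.yarn.Client;
import org.apache.spark.deploy.yarn.ClientArguments;

// Sketch only: submits a Python file to YARN through the same
// org.apache.spark.deploy.yarn.Client, using the --primary-py-file
// option from the usage text above instead of --jar/--class.
public class SubmitPySparkPiToYARNFromJavaCode {

    public static void main(String[] args) throws Exception {
        String SPARK_HOME = System.getProperty("SPARK_HOME");
        String[] clientArgs = new String[]{
            "--name", "Submit-PythonPi-To-Yarn",          // assumed application name
            "--primary-py-file",
            SPARK_HOME + "/examples/src/main/python/pi.py", // assumed script location
            "--arg", "10"                                   // partitions argument for pi.py
        };
        // same YARN client invocation as in the Java example above
        System.setProperty("SPARK_YARN_MODE", "true");
        SparkConf sparkConf = new SparkConf();
        Configuration hadoopConf = new Configuration();
        Client client = new Client(new ClientArguments(clientArgs, sparkConf), hadoopConf, sparkConf);
        client.run();
    }
}
```

If the script had dependencies beyond the main file, they would go through the --py-files option listed in the usage text above.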