ankamv opened this issue 7 years ago
Hi @ankamv , thank you for reporting the issue.
Can you try adding the following to your pyspark launch command?
--jars <YOUR_LIB_ROOT>/scala-logging-slf4j_2.11-2.1.2.jar,<YOUR_LIB_ROOT>/scala-logging-api_2.11-2.1.2.jar
Also, for testing purposes, please limit the size of your train_df if you are running it locally.
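For illustration, a minimal sketch (plain Python; the path is a placeholder standing in for `<YOUR_LIB_ROOT>`, not a value from this thread) of assembling that comma-separated --jars value:

```python
import os

# Sketch only: build the --jars value suggested above.
# "/home/hadoop" is a stand-in for <YOUR_LIB_ROOT>; substitute your own path.
lib_root = "/home/hadoop"
jars = ",".join(
    os.path.join(lib_root, jar)
    for jar in (
        "scala-logging-slf4j_2.11-2.1.2.jar",
        "scala-logging-api_2.11-2.1.2.jar",
    )
)
print(jars)
```

Note that --jars takes a single comma-separated string, not repeated flags.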
Same issue again. I have 6 images of ~1 MB each in my train_df.
I see the following jars in the output of sc._conf.getAll():
('spark.jars', 'file:/home/hadoop/scala-logging-slf4j_2.11-2.1.2.jar,file:/home/hadoop/scala-logging-api_2.11-2.1.2.jar,file:/home/hadoop/.ivy2/jars/databricks_spark-deep-learning-0.1.0-spark2.1-s_2.11.jar,file:/home/hadoop/.ivy2/jars/databricks_tensorframes-0.2.9-s_2.11.jar,file:/home/hadoop/.ivy2/jars/com.typesafe.scala-logging_scala-logging-api_2.11-2.1.2.jar,file:/home/hadoop/.ivy2/jars/com.typesafe.scala-logging_scala-logging-slf4j_2.11-2.1.2.jar,file:/home/hadoop/.ivy2/jars/org.slf4j_slf4j-api-1.7.7.jar,file:/home/hadoop/.ivy2/jars/org.apache.commons_commons-proxy-1.0.jar,file:/home/hadoop/.ivy2/jars/org.scalactic_scalactic_2.11-3.0.0.jar,file:/home/hadoop/.ivy2/jars/org.apache.commons_commons-lang3-3.4.jar,file:/home/hadoop/.ivy2/jars/org.tensorflow_tensorflow-1.3.0.jar,file:/home/hadoop/.ivy2/jars/org.scala-lang_scala-reflect-2.11.8.jar,file:/home/hadoop/.ivy2/jars/org.tensorflow_libtensorflow-1.3.0.jar,file:/home/hadoop/.ivy2/jars/org.tensorflow_libtensorflow_jni-1.3.0.jar')
I also tried adding the two logging jars via --py-files, but it did not help.
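As a sanity check, the 'spark.jars' value above can be scanned for the two logging jars. A minimal sketch in plain Python (no Spark session needed), using an abbreviated copy of that string:

```python
# Sketch: verify both scala-logging jars appear in the 'spark.jars' value
# reported by sc._conf.getAll(). The string here is shortened from the
# full listing above.
spark_jars = (
    "file:/home/hadoop/scala-logging-slf4j_2.11-2.1.2.jar,"
    "file:/home/hadoop/scala-logging-api_2.11-2.1.2.jar,"
    "file:/home/hadoop/.ivy2/jars/org.slf4j_slf4j-api-1.7.7.jar"
)
required = ("scala-logging-slf4j_2.11", "scala-logging-api_2.11")
names = [entry.rsplit("/", 1)[-1] for entry in spark_jars.split(",")]
missing = [r for r in required if not any(n.startswith(r) for n in names)]
print(missing)  # an empty list means both jars are on the driver's jar list
```

An empty `missing` list only confirms the jars are registered with the driver; it does not rule out a classpath conflict on the executors.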
Hi @ankamv, can you try running your example locally on your dev box or laptop to see if it works?
Hi @phi-dbq,
I have the same problem as @ankamv. I tried starting PySpark with the following command:
$SPARK_HOME/bin/pyspark --packages databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11,databricks:tensorframes:0.8.2-s_2.11 --jars scala-logging_2.12-3.9.2.jar,scala-logging-slf4j_2.11-2.1.2.jar
but got the same error as in the original question.
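One thing worth checking (an observation about the command above, not a confirmed fix): the --jars list mixes a scala-logging_2.12 jar with _2.11 artifacts, and Scala 2.11 and 2.12 binaries are not compatible on the same classpath. A plain-Python sketch that flags such a mix from the artifact names:

```python
import re

# Sketch: extract the Scala binary-version suffix (_2.11 / _2.12) from each
# artifact named in the launch command above and flag an inconsistent mix.
artifacts = [
    "databricks:spark-deep-learning:1.5.0-spark2.4-s_2.11",
    "databricks:tensorframes:0.8.2-s_2.11",
    "scala-logging_2.12-3.9.2.jar",
    "scala-logging-slf4j_2.11-2.1.2.jar",
]
scala_versions = {
    m.group(1)
    for a in artifacts
    if (m := re.search(r"_(2\.1[12])\b", a))
}
print(scala_versions)  # more than one entry means a binary-incompatible mix
```

If the set contains more than one version, aligning every jar and package on the Scala version your Spark build uses would be the first thing to try.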
I am using the pyspark shell with the command below on an EMR cluster (Spark 2.1.1; tried Python 2.7.12 and Anaconda Python 3.5.4):
pyspark --master local[2] --packages databricks:spark-deep-learning:0.1.0-spark2.1-s_2.11,databricks:tensorframes:0.2.9-s_2.11 --jars /home/hadoop/scala-logging-slf4j_2.11-2.1.2.jar
I am trying to run logistic regression on a training data set:

```python
featurizer = DeepImageFeaturizer(inputCol="image", outputCol="features", modelName="InceptionV3")
lr = LogisticRegression(maxIter=20, regParam=0.05, elasticNetParam=0.3, labelCol="label")
p = Pipeline(stages=[featurizer, lr])
p_model = p.fit(train_df)
```

and getting the error below: