@jenniew please take a look
Hi, I am also facing the same issue when I try to run the object detection code with Jupyter.
export SPARK_HOME=/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark
export ANALYTICS_ZOO_HOME=/root/Desktop/analytics-zoo/dist
MASTER=local[*]
${ANALYTICS_ZOO_HOME}/bin/jupyter-with-zoo.sh \
    --master ${MASTER} \
    --driver-cores 2 \
    --driver-memory 8g \
    --total-executor-cores 2 \
    --executor-cores 2 \
    --executor-memory 8g
As soon as I press enter, I get the following error:
Exception in thread "main" java.lang.IllegalArgumentException: pyspark does not support any application options.
at org.apache.spark.launcher.CommandBuilderUtils.checkArgument(CommandBuilderUtils.java:242)
at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildPySparkShellCommand(SparkSubmitCommandBuilder.java:241)
at org.apache.spark.launcher.SparkSubmitCommandBuilder.buildCommand(SparkSubmitCommandBuilder.java:117)
at org.apache.spark.launcher.Main.main(Main.java:86)
What could be the possible reason?
@jason-dai Please help me out.
@dding3 please take a look.
@BhagyasriYella Did you make any changes to jupyter-with-zoo.sh? If not, could you please check whether pyspark works? Run
${SPARK_HOME}/bin/pyspark
and then run from pyspark import SparkContext. Usually, the "pyspark does not support any application options" exception is caused by options not being passed to spark-submit properly.
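As a minimal sketch of that check (illustrative only, not from the original thread), the session inside the pyspark shell should look like this:

    # Start the shell first: ${SPARK_HOME}/bin/pyspark
    # Inside the shell, `sc` is predefined by pyspark:
    from pyspark import SparkContext  # should import cleanly, with no traceback
    print(sc.version)                 # prints the Spark version the shell is running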
@dding3 I did not make any changes to jupyter-with-zoo.sh. I am new to this, so please correct me if I am wrong. I have tried ${SPARK_HOME}/bin/pyspark; it gave me the following:
bigdatapoc01:~ # export SPARK_HOME=/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark
bigdatapoc01:~ # ${SPARK_HOME}/bin/pyspark
Python 3.6.0 |Anaconda 4.3.0 (64-bit)| (default, Dec 23 2016, 12:22:00)
[GCC 4.4.7 20120313 (Red Hat 4.4.7-1)] on linux
Type "help", "copyright", "credits" or "license" for more information.
Traceback (most recent call last):
  File "/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark/python/pyspark/shell.py", line 30, in <module>
    import pyspark
  File "/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark/python/pyspark/__init__.py", line 41, in <module>
    from pyspark.context import SparkContext
  File "/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark/python/pyspark/context.py", line 33, in <module>
    from pyspark.java_gateway import launch_gateway
  File "/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark/python/pyspark/java_gateway.py", line 31, in <module>
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
  File "/root/anaconda3/lib/python3.6/site-packages/py4j/java_gateway.py", line 18, in <module>
    from pydoc import pager
  File "/root/anaconda3/lib/python3.6/pydoc.py", line 59, in <module>
    import inspect
  File "/root/anaconda3/lib/python3.6/inspect.py", line 334, in <module>
    Attribute = namedtuple('Attribute', 'name kind defining_class object')
  File "/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark/python/pyspark/serializers.py", line 381, in namedtuple
    cls = _old_namedtuple(*args, **kwargs)
TypeError: namedtuple() missing 3 required keyword-only arguments: 'verbose', 'rename', and 'module'
from pyspark import SparkContext
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark/python/pyspark/__init__.py", line 41, in <module>
    from pyspark.context import SparkContext
  File "/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark/python/pyspark/context.py", line 33, in <module>
    from pyspark.java_gateway import launch_gateway
  File "/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark/python/pyspark/java_gateway.py", line 31, in <module>
    from py4j.java_gateway import java_import, JavaGateway, GatewayClient
  File "/root/anaconda3/lib/python3.6/site-packages/py4j/java_gateway.py", line 18, in <module>
    from pydoc import pager
  File "/root/anaconda3/lib/python3.6/pydoc.py", line 59, in <module>
    import inspect
  File "/root/anaconda3/lib/python3.6/inspect.py", line 334, in <module>
    Attribute = namedtuple('Attribute', 'name kind defining_class object')
  File "/opt/cloudera/parcels/CDH-5.13.2-1.cdh5.13.2.p0.3/lib/spark/python/pyspark/serializers.py", line 381, in namedtuple
    cls = _old_namedtuple(*args, **kwargs)
TypeError: namedtuple() missing 3 required keyword-only arguments: 'verbose', 'rename', and 'module'
I am afraid there is something wrong with your Spark environment, as there is an exception when you start pyspark. We need to fix that before running the object detection notebook.
I noticed you are using Python 3.6. What is your Spark version? You can check it with
${SPARK_HOME}/bin/spark-submit --version
Spark <= 2.1.0 is not compatible with Python 3.6; there is a similar issue reported when running Spark <= 2.1 with Python 3.6: https://stackoverflow.com/questions/42349980/unable-to-run-pyspark
My wild guess is that it is a problem with the Python version. It's worth a try to use Python 2.7 or Python 3.5.
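A quick guard at the top of the notebook (purely illustrative; the version threshold follows the compatibility note above) would surface the mismatch early, before any Spark import fails:

    import sys

    # Spark <= 2.1.0 cannot run on Python 3.6+, so fail fast with a clear
    # message instead of the cryptic namedtuple TypeError.
    if sys.version_info >= (3, 6):
        raise RuntimeError("This Spark build predates Python 3.6 support; "
                           "use Python 2.7 or 3.5, or upgrade Spark.")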
Hi @BhagyasriYella The error you mentioned,
TypeError: namedtuple() missing 3 required keyword-only arguments: 'verbose', 'rename', and 'module'
is due to an incompatibility between earlier versions of Spark and Python 3.6. Please see this issue for more discussion: https://issues.apache.org/jira/browse/SPARK-19019
This has been fixed in Spark 1.6.4, 2.0.3, 2.1.1, and 2.2.0. If you are using Python 3.6, it is recommended that you use Spark >= 2.2.0. Would you mind switching your Spark version and having another try? Thanks.
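For background on why the TypeError looks the way it does: older versions of pyspark/serializers.py replaced collections.namedtuple with a copy of the function made via types.FunctionType, and such copies do not carry over keyword-only defaults, which Python 3.6's namedtuple relies on. A standalone sketch of that pitfall (the function below is a stand-in, not actual Spark code):

    import types

    # Stand-in for Python 3.6's namedtuple, whose extra arguments are
    # keyword-only with defaults:
    def original(name, *, verbose=False, rename=False, module=None):
        return (name, verbose, rename, module)

    # Copying the function object drops __kwdefaults__, which is roughly
    # what old Spark's namedtuple hijack did:
    copied = types.FunctionType(original.__code__, original.__globals__,
                                original.__name__, original.__defaults__,
                                original.__closure__)

    original("Point")  # fine: keyword-only defaults apply
    try:
        copied("Point")
    except TypeError as e:
        print(e)  # missing 3 required keyword-only arguments:
                  # 'verbose', 'rename', and 'module'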
The original issue is caused by a BigDL issue: https://github.com/intel-analytics/BigDL/issues/2558
@dding3 Thanks, Ding. It worked after I updated the Spark version. Thanks a lot.
You are very welcome. We are glad to help :)
The same issue happens in examples/nnframes/inference. With 50K ImageNet val images:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 19 in stage 2.0 failed 4 times, most recent failure: Lost task 19.3 in stage 2.0 (TID 27, emr-worker-4.cluster-74716, executor 1): java.lang.ArrayIndexOutOfBoundsException
at java.lang.System.arraycopy(Native Method)
at com.intel.analytics.bigdl.tensor.TensorNumericMath$TensorNumeric$NumericFloat$.arraycopy$mcF$sp(TensorNumeric.scala:721)
at com.intel.analytics.bigdl.tensor.TensorNumericMath$TensorNumeric$NumericFloat$.arraycopy(TensorNumeric.scala:715)
at com.intel.analytics.bigdl.tensor.TensorNumericMath$TensorNumeric$NumericFloat$.arraycopy(TensorNumeric.scala:503)
at com.intel.analytics.bigdl.dataset.MiniBatch$.copy(MiniBatch.scala:460)
at com.intel.analytics.bigdl.dataset.MiniBatch$.copyWithPadding(MiniBatch.scala:380)
at com.intel.analytics.bigdl.dataset.ArrayTensorMiniBatch.set(MiniBatch.scala:209)
at com.intel.analytics.bigdl.dataset.ArrayTensorMiniBatch.set(MiniBatch.scala:111)
at com.intel.analytics.bigdl.dataset.SampleToMiniBatch$$anon$2.next(Transformer.scala:348)
at com.intel.analytics.bigdl.dataset.SampleToMiniBatch$$anon$2.next(Transformer.scala:323)
at scala.collection.Iterator$$anon$12.nextCur(Iterator.scala:434)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:440)
at scala.collection.Iterator$$anon$19.hasNext(Iterator.scala:800)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:438)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
at org.apache.spark.sql.catalyst.expressions.GeneratedClass
With 10K ImageNet val images:
Exception in thread "main" org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 2.0 failed 4 times, most recent failure: Lost task 3.3 in stage 2.0 (TID 15, emr-worker-3.cluster-74716, executor 2): Layer info: StaticGraph[GoogleNet]/SpatialConvolution[conv1/7x7_s2](3 -> 64, 7 x 7, 2, 2, 3, 3)
java.lang.IllegalArgumentException: requirement failed: input channel size 30 is not the same as nInputPlane 3
at scala.Predef$.require(Predef.scala:224)
at com.intel.analytics.bigdl.nn.SpatialConvolution.updateOutput(SpatialConvolution.scala:262)
at com.intel.analytics.bigdl.nn.SpatialConvolution.updateOutput(SpatialConvolution.scala:54)
at com.intel.analytics.bigdl.nn.abstractnn.AbstractModule.forward(AbstractModule.scala:257)
at com.intel.analytics.bigdl.nn.StaticGraph.updateOutput(StaticGraph.scala:59)
at com.intel.analytics.bigdl.nn.abstractnn.AbstractModule.forward(AbstractModule.scala:257)
at com.intel.analytics.zoo.pipeline.nnframes.NNModel$$anonfun$2$$anonfun$apply$1$$anonfun$4.apply(NNEstimator.scala:531)
There is a bug in BigDL 0.6 that has been fixed in 0.7; please try Analytics Zoo 0.3.0 with BigDL 0.7.1 (https://analytics-zoo.github.io/master/#release-download/#release-030).
We met errors when using examples/imageclassification/Predict.scala to predict with Inception v1 on the ImageNet val set. It reported
java.lang.IllegalArgumentException
for 10k images and
java.lang.ArrayIndexOutOfBoundsException
for 5k images. Predicting 1000 images passes. Execution script:
Error when predicting 10000 images:
Error when predicting 5000 images: