Intel-bigdata / HiBench

HiBench is a big data benchmark suite.

UnicodeDecodeError: 'utf8' codec can't decode byte 0xfd: ERROR: terasort/spark/python failed to run successfully. #155

Open · ghost opened 8 years ago

ghost commented 8 years ago

15/11/26 23:58:16 ERROR executor.Executor: Exception in task 0.0 in stage 1.0 (TID 1)
org.apache.spark.api.python.PythonException: Traceback (most recent call last):
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfd in position 12: invalid start byte
15/11/26 23:58:16 WARN scheduler.TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, localhost): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfd in position 12: invalid start byte
15/11/26 23:58:16 ERROR scheduler.TaskSetManager: Task 0 in stage 1.0 failed 1 times; aborting job
15/11/26 23:58:16 WARN python.PythonRunner: Incomplete task interrupted: Attempting to kill Python Worker
15/11/26 23:58:16 WARN scheduler.TaskSetManager: Lost task 1.0 in stage 1.0 (TID 2, localhost): TaskKilled (killed intentionally)
py4j.protocol.Py4JJavaError: An error occurred while calling z:org.apache.spark.api.python.PythonRDD.collectAndServe.
: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 1.0 failed 1 times, most recent failure: Lost task 0.0 in stage 1.0 (TID 1, localhost): org.apache.spark.api.python.PythonException: Traceback (most recent call last):
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfd in position 12: invalid start byte
Caused by: org.apache.spark.api.python.PythonException: Traceback (most recent call last):
UnicodeDecodeError: 'utf8' codec can't decode byte 0xfd in position 12: invalid start byte
15/11/26 23:58:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
ERROR: Spark job /home/hduser/HiBench/src/sparkbench/src/main/python/terasort.py failed to run successfully.
Hint: You can goto /home/hduser/HiBench/report/terasort/spark/python/conf/../bench.log to check for detailed log. Opening log tail for you:

15/11/26 23:58:16 INFO storage.BlockManager: BlockManager stopped
15/11/26 23:58:16 INFO storage.BlockManagerMaster: BlockManagerMaster stopped
15/11/26 23:58:16 INFO scheduler.OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
15/11/26 23:58:16 INFO spark.SparkContext: Successfully stopped SparkContext
15/11/26 23:58:16 INFO util.ShutdownHookManager: Shutdown hook called
15/11/26 23:58:16 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-e49b1f0f-5b8e-429c-a57e-008007740bb6/pyspark-e0687dce-a34d-451c-aa1a-dceabb4b7288
15/11/26 23:58:16 INFO util.ShutdownHookManager: Deleting directory /tmp/spark-e49b1f0f-5b8e-429c-a57e-008007740bb6
15/11/26 23:58:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: Shutting down remote daemon.
15/11/26 23:58:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remote daemon shut down; proceeding with flushing remote transports.
15/11/26 23:58:16 INFO remote.RemoteActorRefProvider$RemotingTerminator: Remoting shut down.
ERROR: terasort/spark/python failed to run successfully.
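For context: byte 0xfd can never start a UTF-8 sequence, so PySpark's default line decoding fails as soon as the TeraSort input contains arbitrary binary record keys. The following is only a minimal sketch, not HiBench's terasort.py, showing how the same UnicodeDecodeError arises and how `textFile(..., use_unicode=False)` keeps lines as raw bytes; the input path and app name are placeholders.

```python
# Minimal sketch (assumptions: placeholder input path, not the HiBench script).
from pyspark import SparkContext

# 0xfd is an invalid UTF-8 start byte, so decoding raises the same class of
# error that appears in the log above.
try:
    b"\xfd\x00example-record".decode("utf-8")
except UnicodeDecodeError as err:
    print(err)

sc = SparkContext(appName="decode-sketch")

# textFile() decodes each line as UTF-8 by default (use_unicode=True).
# With use_unicode=False the lines stay as raw byte strings, so binary
# TeraGen-style records no longer trip the decoder; whether that is an
# acceptable workaround depends on how the benchmark processes the records.
raw = sc.textFile("hdfs:///tmp/terasort_input", use_unicode=False)
print(raw.take(1))

sc.stop()
```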

carsonwang commented 8 years ago

Does this occur consistently? Which HiBench and Spark versions are you using?

ghost commented 8 years ago

HiBench 5.0 and Spark 1.5.1. Once I got the error in TeraSort, the remaining reports, such as nutchindexing, dfsioe, and streamingbench, were also not generated. However, I did get the aggregation, bayes, join, kmeans, pagerank, scan, sleep, sort, and wordcount reports.