From the current setting, I guess it is 2.
Tested it by broadcasting the StanfordCoreNLP object; it failed.
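For reference, the failed attempt looked roughly like this (a minimal sketch, not the exact project code; the annotator list is abbreviated). StanfordCoreNLP does not implement java.io.Serializable, which is exactly why the broadcast blows up as soon as Spark serializes the value:

```java
import java.util.Properties;

import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.broadcast.Broadcast;

import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class BroadcastAttempt {
    public static void main(String[] args) {
        // Assumes master/appName are supplied via spark-submit.
        JavaSparkContext sc = new JavaSparkContext();
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,ssplit,pos"); // abbreviated list
        StanfordCoreNLP pipeline = new StanfordCoreNLP(props);
        // Fails here: StanfordCoreNLP is not Serializable, so Spark throws
        // java.io.NotSerializableException while writing the broadcast value.
        Broadcast<StanfordCoreNLP> broadcast = sc.broadcast(pipeline);
        System.out.println(broadcast.id());
    }
}
```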
➜ cat run.sh
spark-submit --class ca.uwaterloo.cs651.project.CoreNLP --conf spark.executor.heartbeatInterval=20s --conf spark.network.timeout=600s --driver-memory 6G --executor-memory 20G target/project-1.0.jar -input sampledata -output output -functionality tokenize,cleanxml,ssplit,pos,lemma,ner,parse,depparse,sentiment,natlog,coref,relation
➜ wc -l sampledata
3 sampledata
➜ time ./run.sh
./run.sh 343.36s user 31.80s system 421% cpu 1:29.07 total
➜ for i in `seq 1 100`; do
for> cat sampledata >> bigdata
for> done
➜ wc -l bigdata
303 bigdata
➜ time ./run.sh # in run.sh, sampledata is changed to bigdata
CPU time limit exceeded
./run.sh 3527.43s user 75.91s system 120% cpu 49:46.18 total
It sucks.
Possible solutions:
1. mapPartitions instead of map
2. class SerializableStanfordCoreNLP extends StanfordCoreNLP implements java.io.Serializable
The first one seems to be the easiest. Let me try it.
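For concreteness, solution 1 would look roughly like this (a sketch, assuming the input is a JavaRDD<String> with one document per line; the annotator list is abbreviated). The point is that the expensive model loading happens once per partition instead of once per record:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

import org.apache.spark.api.java.JavaRDD;

import edu.stanford.nlp.pipeline.Annotation;
import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class Solution1 {
    // One pipeline per partition instead of one per record.
    static JavaRDD<String> annotate(JavaRDD<String> lines) {
        return lines.mapPartitions(iter -> {
            Properties props = new Properties();
            props.setProperty("annotators", "tokenize,ssplit,pos"); // abbreviated
            StanfordCoreNLP pipeline = new StanfordCoreNLP(props);  // built once here
            List<String> out = new ArrayList<>();
            while (iter.hasNext()) {
                Annotation doc = new Annotation(iter.next());
                pipeline.annotate(doc);
                out.add(doc.toString());
            }
            return out.iterator();
        });
    }
}
```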
After some research, I've ended up at the third solution. :(
Using solution 1:
spark-submit --class ca.uwaterloo.cs651.project.CoreNLP --conf spark.executor.heartbeatInterval=1s --conf spark.network.timeout=2s --driver-memory 6G --executor-memory 20G target/project-1.0.jar -input sampledata -output output -functionality tokenize,ssplit,pos,ner,parse
sampledata: ./run.sh 1:05.67
bigdata: ./run.sh 46:41.54
Not bad.
One thing is for sure: lines such as "2018-11-21 14:08:38 INFO StanfordCoreNLP:88 - Adding annotator tokenize" now appear only once per functionality.
When I try to run this on the datasci cluster, I encounter the following problem:
2018-11-21 19:43:22 WARN NativeCodeLoader:62 - Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
2018-11-21 19:43:23 INFO CoreNLP:137 - Tool: CoreNLP
2018-11-21 19:43:23 INFO CoreNLP:138 - input path: sampledata
2018-11-21 19:43:23 INFO CoreNLP:139 - output path: output
2018-11-21 19:43:23 INFO CoreNLP:140 - - functionalities: tokenize,ssplit,pos,ner,parse
2018-11-21 19:43:23 INFO SparkContext:54 - Running Spark version 2.3.1
2018-11-21 19:43:23 INFO SparkContext:54 - Submitted application: CoreNLP
2018-11-21 19:43:23 INFO SecurityManager:54 - Changing view acls to: j9xin
2018-11-21 19:43:23 INFO SecurityManager:54 - Changing modify acls to: j9xin
2018-11-21 19:43:23 INFO SecurityManager:54 - Changing view acls groups to:
2018-11-21 19:43:23 INFO SecurityManager:54 - Changing modify acls groups to:
2018-11-21 19:43:23 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(j9xin); groups with view permissions: Set(); users with modify permissions: Set(j9xin); groups with modify permissions: Set()
2018-11-21 19:43:23 INFO Utils:54 - Successfully started service 'sparkDriver' on port 43661.
2018-11-21 19:43:23 INFO SparkEnv:54 - Registering MapOutputTracker
2018-11-21 19:43:23 INFO SparkEnv:54 - Registering BlockManagerMaster
2018-11-21 19:43:23 INFO BlockManagerMasterEndpoint:54 - Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
2018-11-21 19:43:23 INFO BlockManagerMasterEndpoint:54 - BlockManagerMasterEndpoint up
2018-11-21 19:43:23 INFO DiskBlockManager:54 - Created local directory at /tmp/blockmgr-8a36b07b-fc5c-4cc0-9451-61f4660635b4
2018-11-21 19:43:23 INFO MemoryStore:54 - MemoryStore started with capacity 3.0 GB
2018-11-21 19:43:23 INFO SparkEnv:54 - Registering OutputCommitCoordinator
2018-11-21 19:43:23 INFO log:192 - Logging initialized @1969ms
2018-11-21 19:43:23 INFO Server:346 - jetty-9.3.z-SNAPSHOT
2018-11-21 19:43:23 INFO Server:414 - Started @2036ms
2018-11-21 19:43:23 WARN Utils:66 - Service 'SparkUI' could not bind on port 4040. Attempting port 4041.
2018-11-21 19:43:23 WARN Utils:66 - Service 'SparkUI' could not bind on port 4041. Attempting port 4042.
2018-11-21 19:43:23 WARN Utils:66 - Service 'SparkUI' could not bind on port 4042. Attempting port 4043.
2018-11-21 19:43:23 INFO AbstractConnector:278 - Started ServerConnector@26f143ed{HTTP/1.1,[http/1.1]}{0.0.0.0:4043}
2018-11-21 19:43:23 INFO Utils:54 - Successfully started service 'SparkUI' on port 4043.
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1bb9aa43{/jobs,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@e27ba81{/jobs/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@54336c81{/jobs/job,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@35e52059{/jobs/job/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@62577d6{/stages,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@49bd54f7{/stages/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6b5f8707{/stages/stage,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@79ab3a71{/stages/stage/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6e5bfdfc{/stages/pool,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3d829787{/stages/pool/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@71652c98{/storage,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@51bde877{/storage/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@60b85ba1{/storage/rdd,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@492fc69e{/storage/rdd/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@117632cf{/environment,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2fb68ec6{/environment/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@d71adc2{/executors,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@3add81c4{/executors/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1a1d3c1a{/executors/threadDump,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@1c65121{/executors/threadDump/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@159e366{/static,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@49bf29c6{/,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@7ee55e70{/api,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@126be319{/jobs/job/kill,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@6c44052e{/stages/stage/kill,null,AVAILABLE,@Spark}
2018-11-21 19:43:23 INFO SparkUI:54 - Bound SparkUI to 0.0.0.0, and started at http://datasci.datasci-domain.cs.uwaterloo.ca:4043
2018-11-21 19:43:23 INFO SparkContext:54 - Added JAR file:/u5/j9xin/Spark-CoreNLP/target/project-1.0.jar at spark://datasci.datasci-domain.cs.uwaterloo.ca:43661/jars/project-1.0.jar with timestamp 1542847403868
2018-11-21 19:43:24 INFO RMProxy:98 - Connecting to ResourceManager at datasci.datasci-domain.cs.uwaterloo.ca/192.168.167.1:8032
2018-11-21 19:43:24 INFO Client:54 - Requesting a new application from cluster with 5 NodeManagers
2018-11-21 19:43:24 INFO Client:54 - Verifying our application has not requested more than the maximum memory capability of the cluster (32768 MB per container)
2018-11-21 19:43:24 INFO Client:54 - Will allocate AM container, with 896 MB memory including 384 MB overhead
2018-11-21 19:43:24 INFO Client:54 - Setting up container launch context for our AM
2018-11-21 19:43:24 INFO Client:54 - Setting up the launch environment for our AM container
2018-11-21 19:43:24 INFO Client:54 - Preparing resources for our AM container
2018-11-21 19:43:25 WARN Client:66 - Neither spark.yarn.jars nor spark.yarn.archive is set, falling back to uploading libraries under SPARK_HOME.
2018-11-21 19:43:27 INFO Client:54 - Uploading resource file:/tmp/spark-43c47642-9007-4468-a402-8305e01479d3/__spark_libs__826626117090988318.zip -> hdfs://name1.datasci-domain.cs.uwaterloo.ca:8020/user/j9xin/.sparkStaging/application_1542745215044_1861/__spark_libs__826626117090988318.zip
2018-11-21 19:43:35 INFO Client:54 - Uploading resource file:/tmp/spark-43c47642-9007-4468-a402-8305e01479d3/__spark_conf__4769921634370175314.zip -> hdfs://name1.datasci-domain.cs.uwaterloo.ca:8020/user/j9xin/.sparkStaging/application_1542745215044_1861/__spark_conf__.zip
2018-11-21 19:43:35 INFO SecurityManager:54 - Changing view acls to: j9xin
2018-11-21 19:43:35 INFO SecurityManager:54 - Changing modify acls to: j9xin
2018-11-21 19:43:35 INFO SecurityManager:54 - Changing view acls groups to:
2018-11-21 19:43:35 INFO SecurityManager:54 - Changing modify acls groups to:
2018-11-21 19:43:35 INFO SecurityManager:54 - SecurityManager: authentication disabled; ui acls disabled; users with view permissions: Set(j9xin); groups with view permissions: Set(); users with modify permissions: Set(j9xin); groups with modify permissions: Set()
2018-11-21 19:43:35 INFO Client:54 - Submitting application application_1542745215044_1861 to ResourceManager
2018-11-21 19:43:35 INFO YarnClientImpl:273 - Submitted application application_1542745215044_1861
2018-11-21 19:43:35 INFO SchedulerExtensionServices:54 - Starting Yarn extension services with app application_1542745215044_1861 and attemptId None
2018-11-21 19:43:36 INFO Client:54 - Application report for application_1542745215044_1861 (state: ACCEPTED)
2018-11-21 19:43:36 INFO Client:54 -
client token: N/A
diagnostics: AM container is launched, waiting for AM container to Register with RM
ApplicationMaster host: N/A
ApplicationMaster RPC port: -1
queue: default
start time: 1542847415598
final status: UNDEFINED
tracking URL: http://192.168.167.1:8089/proxy/application_1542745215044_1861/
user: j9xin
2018-11-21 19:43:37 INFO Client:54 - Application report for application_1542745215044_1861 (state: ACCEPTED)
2018-11-21 19:43:38 INFO Client:54 - Application report for application_1542745215044_1861 (state: ACCEPTED)
2018-11-21 19:43:39 INFO Client:54 - Application report for application_1542745215044_1861 (state: ACCEPTED)
2018-11-21 19:43:39 INFO YarnClientSchedulerBackend:54 - Add WebUI Filter. org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter, Map(PROXY_HOSTS -> datasci.datasci-domain.cs.uwaterloo.ca, PROXY_URI_BASES -> http://datasci.datasci-domain.cs.uwaterloo.ca:8088/proxy/application_1542745215044_1861), /proxy/application_1542745215044_1861
2018-11-21 19:43:39 INFO JettyUtils:54 - Adding filter: org.apache.hadoop.yarn.server.webproxy.amfilter.AmIpFilter
2018-11-21 19:43:40 INFO YarnSchedulerBackend$YarnSchedulerEndpoint:54 - ApplicationMaster registered as NettyRpcEndpointRef(spark-client://YarnAM)
2018-11-21 19:43:40 INFO Client:54 - Application report for application_1542745215044_1861 (state: RUNNING)
2018-11-21 19:43:40 INFO Client:54 -
client token: N/A
diagnostics: N/A
ApplicationMaster host: 192.168.167.102
ApplicationMaster RPC port: 0
queue: default
start time: 1542847415598
final status: UNDEFINED
tracking URL: http://192.168.167.1:8089/proxy/application_1542745215044_1861/
user: j9xin
2018-11-21 19:43:40 INFO YarnClientSchedulerBackend:54 - Application application_1542745215044_1861 has started running.
2018-11-21 19:43:40 INFO Utils:54 - Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35221.
2018-11-21 19:43:40 INFO NettyBlockTransferService:54 - Server created on datasci.datasci-domain.cs.uwaterloo.ca:35221
2018-11-21 19:43:40 INFO BlockManager:54 - Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
2018-11-21 19:43:40 INFO BlockManagerMaster:54 - Registering BlockManager BlockManagerId(driver, datasci.datasci-domain.cs.uwaterloo.ca, 35221, None)
2018-11-21 19:43:40 INFO BlockManagerMasterEndpoint:54 - Registering block manager datasci.datasci-domain.cs.uwaterloo.ca:35221 with 3.0 GB RAM, BlockManagerId(driver, datasci.datasci-domain.cs.uwaterloo.ca, 35221, None)
2018-11-21 19:43:40 INFO BlockManagerMaster:54 - Registered BlockManager BlockManagerId(driver, datasci.datasci-domain.cs.uwaterloo.ca, 35221, None)
2018-11-21 19:43:40 INFO BlockManager:54 - Initialized BlockManager: BlockManagerId(driver, datasci.datasci-domain.cs.uwaterloo.ca, 35221, None)
2018-11-21 19:43:40 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2dfa02c1{/metrics/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:44 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.167.101:56738) with ID 2
2018-11-21 19:43:44 INFO BlockManagerMasterEndpoint:54 - Registering block manager hadoop1.datasci-domain.cs.uwaterloo.ca:43567 with 10.5 GB RAM, BlockManagerId(2, hadoop1.datasci-domain.cs.uwaterloo.ca, 43567, None)
2018-11-21 19:43:45 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Registered executor NettyRpcEndpointRef(spark-client://Executor) (192.168.167.103:39686) with ID 1
2018-11-21 19:43:45 INFO YarnClientSchedulerBackend:54 - SchedulerBackend is ready for scheduling beginning after reached minRegisteredResourcesRatio: 0.8
2018-11-21 19:43:45 INFO MemoryStore:54 - Block broadcast_0 stored as values in memory (estimated size 648.0 B, free 3.0 GB)
2018-11-21 19:43:45 INFO BlockManagerMasterEndpoint:54 - Registering block manager hadoop3.datasci-domain.cs.uwaterloo.ca:37459 with 10.5 GB RAM, BlockManagerId(1, hadoop3.datasci-domain.cs.uwaterloo.ca, 37459, None)
2018-11-21 19:43:45 INFO MemoryStore:54 - Block broadcast_0_piece0 stored as bytes in memory (estimated size 284.0 B, free 3.0 GB)
2018-11-21 19:43:45 INFO BlockManagerInfo:54 - Added broadcast_0_piece0 in memory on datasci.datasci-domain.cs.uwaterloo.ca:35221 (size: 284.0 B, free: 3.0 GB)
2018-11-21 19:43:45 INFO SparkContext:54 - Created broadcast 0 from broadcast at CoreNLP.java:158
2018-11-21 19:43:45 INFO SharedState:54 - Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/u5/j9xin/Spark-CoreNLP/spark-warehouse/').
2018-11-21 19:43:45 INFO SharedState:54 - Warehouse path is 'file:/u5/j9xin/Spark-CoreNLP/spark-warehouse/'.
2018-11-21 19:43:45 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@71789580{/SQL,null,AVAILABLE,@Spark}
2018-11-21 19:43:45 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@23ee2ccf{/SQL/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:45 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@9d3d54e{/SQL/execution,null,AVAILABLE,@Spark}
2018-11-21 19:43:45 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@2f04993d{/SQL/execution/json,null,AVAILABLE,@Spark}
2018-11-21 19:43:45 INFO ContextHandler:781 - Started o.s.j.s.ServletContextHandler@16f15ae9{/static/sql,null,AVAILABLE,@Spark}
2018-11-21 19:43:45 INFO StateStoreCoordinatorRef:54 - Registered StateStoreCoordinator endpoint
2018-11-21 19:43:46 INFO FileSourceStrategy:54 - Pruning directories with:
2018-11-21 19:43:46 INFO FileSourceStrategy:54 - Post-Scan Filters:
2018-11-21 19:43:46 INFO FileSourceStrategy:54 - Output Data Schema: struct<value: string>
2018-11-21 19:43:46 INFO FileSourceScanExec:54 - Pushed Filters:
2018-11-21 19:43:47 INFO CodeGenerator:54 - Code generated in 241.781555 ms
2018-11-21 19:43:47 INFO MemoryStore:54 - Block broadcast_1 stored as values in memory (estimated size 292.9 KB, free 3.0 GB)
2018-11-21 19:43:47 INFO MemoryStore:54 - Block broadcast_1_piece0 stored as bytes in memory (estimated size 25.0 KB, free 3.0 GB)
2018-11-21 19:43:47 INFO BlockManagerInfo:54 - Added broadcast_1_piece0 in memory on datasci.datasci-domain.cs.uwaterloo.ca:35221 (size: 25.0 KB, free: 3.0 GB)
2018-11-21 19:43:47 INFO SparkContext:54 - Created broadcast 1 from javaRDD at CoreNLP.java:160
2018-11-21 19:43:47 INFO FileSourceScanExec:54 - Planning scan with bin packing, max size: 4194304 bytes, open cost is considered as scanning 4194304 bytes.
2018-11-21 19:43:47 INFO deprecation:1173 - mapred.output.dir is deprecated. Instead, use mapreduce.output.fileoutputformat.outputdir
2018-11-21 19:43:47 INFO FileOutputCommitter:108 - File Output Committer Algorithm version is 1
2018-11-21 19:43:47 INFO SparkContext:54 - Starting job: runJob at SparkHadoopWriter.scala:78
2018-11-21 19:43:47 INFO DAGScheduler:54 - Registering RDD 5 (mapPartitionsToPair at CoreNLP.java:164)
2018-11-21 19:43:47 INFO DAGScheduler:54 - Got job 0 (runJob at SparkHadoopWriter.scala:78) with 5 output partitions
2018-11-21 19:43:47 INFO DAGScheduler:54 - Final stage: ResultStage 1 (runJob at SparkHadoopWriter.scala:78)
2018-11-21 19:43:47 INFO DAGScheduler:54 - Parents of final stage: List(ShuffleMapStage 0)
2018-11-21 19:43:47 INFO DAGScheduler:54 - Missing parents: List(ShuffleMapStage 0)
2018-11-21 19:43:47 INFO DAGScheduler:54 - Submitting ShuffleMapStage 0 (MapPartitionsRDD[5] at mapPartitionsToPair at CoreNLP.java:164), which has no missing parents
2018-11-21 19:43:47 INFO MemoryStore:54 - Block broadcast_2 stored as values in memory (estimated size 12.1 KB, free 3.0 GB)
2018-11-21 19:43:47 INFO MemoryStore:54 - Block broadcast_2_piece0 stored as bytes in memory (estimated size 6.5 KB, free 3.0 GB)
2018-11-21 19:43:47 INFO BlockManagerInfo:54 - Added broadcast_2_piece0 in memory on datasci.datasci-domain.cs.uwaterloo.ca:35221 (size: 6.5 KB, free: 3.0 GB)
2018-11-21 19:43:47 INFO SparkContext:54 - Created broadcast 2 from broadcast at DAGScheduler.scala:1039
2018-11-21 19:43:47 INFO DAGScheduler:54 - Submitting 1 missing tasks from ShuffleMapStage 0 (MapPartitionsRDD[5] at mapPartitionsToPair at CoreNLP.java:164) (first 15 tasks are for partitions Vector(0))
2018-11-21 19:43:47 INFO YarnScheduler:54 - Adding task set 0.0 with 1 tasks
2018-11-21 19:43:47 INFO TaskSetManager:54 - Starting task 0.0 in stage 0.0 (TID 0, hadoop1.datasci-domain.cs.uwaterloo.ca, executor 2, partition 0, NODE_LOCAL, 8445 bytes)
2018-11-21 19:43:49 ERROR TransportRequestHandler:226 - Error sending result StreamResponse{streamId=/jars/project-1.0.jar, byteCount=553001604, body=FileSegmentManagedBuffer{file=/u5/j9xin/Spark-CoreNLP/target/project-1.0.jar, offset=0, length=553001604}} to /192.168.167.101:56740; closing connection
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileChannelImpl.transferTo0(Native Method)
at sun.nio.ch.FileChannelImpl.transferToDirectlyInternal(FileChannelImpl.java:428)
at sun.nio.ch.FileChannelImpl.transferToDirectly(FileChannelImpl.java:493)
at sun.nio.ch.FileChannelImpl.transferTo(FileChannelImpl.java:608)
at io.netty.channel.DefaultFileRegion.transferTo(DefaultFileRegion.java:145)
at org.apache.spark.network.protocol.MessageWithHeader.transferTo(MessageWithHeader.java:121)
at io.netty.channel.socket.nio.NioSocketChannel.doWriteFileRegion(NioSocketChannel.java:355)
at io.netty.channel.nio.AbstractNioByteChannel.doWrite(AbstractNioByteChannel.java:224)
at io.netty.channel.socket.nio.NioSocketChannel.doWrite(NioSocketChannel.java:382)
at io.netty.channel.AbstractChannel$AbstractUnsafe.flush0(AbstractChannel.java:934)
at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.forceFlush(AbstractNioChannel.java:368)
at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:639)
at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:580)
at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:497)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:459)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
2018-11-21 19:43:49 WARN TaskSetManager:66 - Lost task 0.0 in stage 0.0 (TID 0, hadoop1.datasci-domain.cs.uwaterloo.ca, executor 2): java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.StreamInterceptor.channelInactive(StreamInterceptor.java:60)
at org.apache.spark.network.util.TransportFrameDecoder.channelInactive(TransportFrameDecoder.java:179)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelInactive(AbstractChannelHandlerContext.java:224)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelInactive(DefaultChannelPipeline.java:1354)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:245)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelInactive(AbstractChannelHandlerContext.java:231)
at io.netty.channel.DefaultChannelPipeline.fireChannelInactive(DefaultChannelPipeline.java:917)
at io.netty.channel.AbstractChannel$AbstractUnsafe$8.run(AbstractChannel.java:822)
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:163)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:403)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:463)
at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:858)
at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:138)
at java.lang.Thread.run(Thread.java:748)
2018-11-21 19:43:49 INFO TaskSetManager:54 - Starting task 0.1 in stage 0.0 (TID 1, hadoop3.datasci-domain.cs.uwaterloo.ca, executor 1, partition 0, NODE_LOCAL, 8445 bytes)
2018-11-21 19:43:51 ERROR TransportRequestHandler:226 - Error sending result StreamResponse{streamId=/jars/project-1.0.jar, byteCount=553001604, body=FileSegmentManagedBuffer{file=/u5/j9xin/Spark-CoreNLP/target/project-1.0.jar, offset=0, length=553001604}} to /192.168.167.103:39688; closing connection
java.io.IOException: Connection reset by peer
    (identical stack trace omitted)
2018-11-21 19:43:52 INFO TaskSetManager:54 - Lost task 0.1 in stage 0.0 (TID 1) on hadoop3.datasci-domain.cs.uwaterloo.ca, executor 1: java.nio.channels.ClosedChannelException (null) [duplicate 1]
2018-11-21 19:43:52 INFO TaskSetManager:54 - Starting task 0.2 in stage 0.0 (TID 2, hadoop3.datasci-domain.cs.uwaterloo.ca, executor 1, partition 0, NODE_LOCAL, 8445 bytes)
2018-11-21 19:43:54 ERROR TransportRequestHandler:226 - Error sending result StreamResponse{streamId=/jars/project-1.0.jar, byteCount=553001604, body=FileSegmentManagedBuffer{file=/u5/j9xin/Spark-CoreNLP/target/project-1.0.jar, offset=0, length=553001604}} to /192.168.167.103:39690; closing connection
java.io.IOException: Connection reset by peer
    (identical stack trace omitted)
2018-11-21 19:43:54 INFO TaskSetManager:54 - Lost task 0.2 in stage 0.0 (TID 2) on hadoop3.datasci-domain.cs.uwaterloo.ca, executor 1: java.nio.channels.ClosedChannelException (null) [duplicate 2]
2018-11-21 19:43:54 INFO TaskSetManager:54 - Starting task 0.3 in stage 0.0 (TID 3, hadoop1.datasci-domain.cs.uwaterloo.ca, executor 2, partition 0, NODE_LOCAL, 8445 bytes)
2018-11-21 19:43:56 ERROR TransportRequestHandler:226 - Error sending result StreamResponse{streamId=/jars/project-1.0.jar, byteCount=553001604, body=FileSegmentManagedBuffer{file=/u5/j9xin/Spark-CoreNLP/target/project-1.0.jar, offset=0, length=553001604}} to /192.168.167.101:56744; closing connection
java.io.IOException: Connection reset by peer
    (identical stack trace omitted)
2018-11-21 19:43:56 INFO TaskSetManager:54 - Lost task 0.3 in stage 0.0 (TID 3) on hadoop1.datasci-domain.cs.uwaterloo.ca, executor 2: java.nio.channels.ClosedChannelException (null) [duplicate 3]
2018-11-21 19:43:56 ERROR TaskSetManager:70 - Task 0 in stage 0.0 failed 4 times; aborting job
2018-11-21 19:43:56 INFO YarnScheduler:54 - Removed TaskSet 0.0, whose tasks have all completed, from pool
2018-11-21 19:43:56 INFO YarnScheduler:54 - Cancelling stage 0
2018-11-21 19:43:56 INFO DAGScheduler:54 - ShuffleMapStage 0 (mapPartitionsToPair at CoreNLP.java:164) failed in 8.747 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, hadoop1.datasci-domain.cs.uwaterloo.ca, executor 2): java.nio.channels.ClosedChannelException
    (identical stack trace omitted)
Driver stacktrace:
2018-11-21 19:43:56 INFO DAGScheduler:54 - Job 0 failed: runJob at SparkHadoopWriter.scala:78, took 8.796281 s
2018-11-21 19:43:56 ERROR SparkHadoopWriter:91 - Aborting job job_20181121194347_0008.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, hadoop1.datasci-domain.cs.uwaterloo.ca, executor 2): java.nio.channels.ClosedChannelException
    (identical stack trace omitted)
Driver stacktrace:
at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1602)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1590)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1589)
at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1589)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:831)
at scala.Option.foreach(Option.scala:257)
at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:831)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1823)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1772)
at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1761)
at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:642)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2034)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2055)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:2087)
at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:78)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply$mcV$sp(PairRDDFunctions.scala:1096)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1094)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopDataset$1.apply(PairRDDFunctions.scala:1094)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopDataset(PairRDDFunctions.scala:1094)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply$mcV$sp(PairRDDFunctions.scala:1067)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:1032)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$4.apply(PairRDDFunctions.scala:1032)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:1032)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply$mcV$sp(PairRDDFunctions.scala:958)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply(PairRDDFunctions.scala:958)
at org.apache.spark.rdd.PairRDDFunctions$$anonfun$saveAsHadoopFile$1.apply(PairRDDFunctions.scala:958)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.PairRDDFunctions.saveAsHadoopFile(PairRDDFunctions.scala:957)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply$mcV$sp(RDD.scala:1493)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1472)
at org.apache.spark.rdd.RDD$$anonfun$saveAsTextFile$1.apply(RDD.scala:1472)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:112)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:363)
at org.apache.spark.rdd.RDD.saveAsTextFile(RDD.scala:1472)
at org.apache.spark.api.java.JavaRDDLike$class.saveAsTextFile(JavaRDDLike.scala:550)
at org.apache.spark.api.java.AbstractJavaRDDLike.saveAsTextFile(JavaRDDLike.scala:45)
at ca.uwaterloo.cs651.project.CoreNLP.main(CoreNLP.java:330)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:894)
at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:198)
at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:228)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:137)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.nio.channels.ClosedChannelException
    (identical stack trace omitted)
Exception in thread "main" org.apache.spark.SparkException: Job aborted.
at org.apache.spark.internal.io.SparkHadoopWriter$.write(SparkHadoopWriter.scala:96)
    (same frames as the job-abort trace above: saveAsHadoopDataset → saveAsTextFile → CoreNLP.main → SparkSubmit.main)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 4 times, most recent failure: Lost task 0.3 in stage 0.0 (TID 3, hadoop1.datasci-domain.cs.uwaterloo.ca, executor 2): java.nio.channels.ClosedChannelException
    (identical stack trace omitted)
Driver stacktrace:
    (same DAGScheduler driver frames as above)
... 41 more
Caused by: java.nio.channels.ClosedChannelException
    (identical stack trace omitted)
2018-11-21 19:43:56 INFO SparkContext:54 - Invoking stop() from shutdown hook
2018-11-21 19:43:56 INFO AbstractConnector:318 - Stopped Spark@26f143ed{HTTP/1.1,[http/1.1]}{0.0.0.0:4043}
2018-11-21 19:43:56 INFO SparkUI:54 - Stopped Spark web UI at http://datasci.datasci-domain.cs.uwaterloo.ca:4043
2018-11-21 19:43:56 INFO YarnClientSchedulerBackend:54 - Interrupting monitor thread
2018-11-21 19:43:56 INFO YarnClientSchedulerBackend:54 - Shutting down all executors
2018-11-21 19:43:56 INFO YarnSchedulerBackend$YarnDriverEndpoint:54 - Asking each executor to shut down
2018-11-21 19:43:56 INFO SchedulerExtensionServices:54 - Stopping SchedulerExtensionServices
(serviceOption=None,
services=List(),
started=false)
2018-11-21 19:43:56 INFO YarnClientSchedulerBackend:54 - Stopped
2018-11-21 19:43:56 INFO MapOutputTrackerMasterEndpoint:54 - MapOutputTrackerMasterEndpoint stopped!
2018-11-21 19:43:56 INFO MemoryStore:54 - MemoryStore cleared
2018-11-21 19:43:56 INFO BlockManager:54 - BlockManager stopped
2018-11-21 19:43:56 INFO BlockManagerMaster:54 - BlockManagerMaster stopped
2018-11-21 19:43:56 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint:54 - OutputCommitCoordinator stopped!
2018-11-21 19:43:56 INFO SparkContext:54 - Successfully stopped SparkContext
2018-11-21 19:43:56 INFO ShutdownHookManager:54 - Shutdown hook called
2018-11-21 19:43:56 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-69182963-585e-4aea-82f9-67f5fb78cae5
2018-11-21 19:43:56 INFO ShutdownHookManager:54 - Deleting directory /tmp/spark-43c47642-9007-4468-a402-8305e01479d3
That was on the datasci cluster.
The ratio is acceptable (~30×, even though the dataset is 100× larger). However, I don't know how to enable multiple mappers.
A possible workaround: manually split the one big input file into several smaller files, then submit multiple parallel Spark jobs to process them, managed by a thread pool (sketch below).
What do you think?
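A sketch of what such a driver could look like (entirely hypothetical: it assumes run.sh is changed to accept input and output paths as arguments, and that the split files are named split-0 through split-9):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class ParallelSubmit {
    public static void main(String[] args) throws InterruptedException {
        // At most 4 spark-submit processes in flight at once.
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < 10; i++) {
            final String input = "split-" + i;   // hypothetical split file
            final String output = "output-" + i; // hypothetical output dir
            pool.submit(() -> {
                try {
                    // Assumes run.sh is modified to take <input> <output>.
                    new ProcessBuilder("./run.sh", input, output)
                            .inheritIO()
                            .start()
                            .waitFor();
                } catch (Exception e) {
                    e.printStackTrace();
                }
            });
        }
        pool.shutdown();
        pool.awaitTermination(1, TimeUnit.DAYS);
    }
}
```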
Let me try manually setting numOfPartitions in the source code tomorrow.
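For reference, the partition count (and with it the number of concurrent mappers) can be requested when reading the input or forced afterwards; a minimal sketch, with 16 as an arbitrary count:

```java
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class Partitions {
    static JavaRDD<String> read(JavaSparkContext sc) {
        // Ask for at least 16 input splits when reading...
        JavaRDD<String> lines = sc.textFile("bigdata", 16);
        // ...or force exactly 16 partitions afterwards (incurs a shuffle).
        return lines.repartition(16);
    }
}
```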
Result: run with --num-executors 4 --executor-cores 4 -mappers 4: 13 min 57 sec
We are not sure which of the following Spark actually does:
1. Create a StanfordCoreNLP object at the initialization of a mapper, then use it to map each input.
2. Create a StanfordCoreNLP object for each input.
If 2 is the case, the efficiency will be unbearable.
Experiments are running at the moment to check which is true.
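One such experiment (a sketch of the idea, not necessarily the run in progress): route every pipeline construction through a counting helper and watch the executor stderr. A count near the number of partitions means case 1; a count near the number of inputs means case 2. The "Adding annotator" INFO lines noted earlier are the same kind of signal.

```java
import java.util.Properties;
import java.util.concurrent.atomic.AtomicLong;

import edu.stanford.nlp.pipeline.StanfordCoreNLP;

public class ConstructionProbe {
    // Static state lives in each executor JVM separately; it is never
    // shipped back to the driver, so read it from the executor stderr.
    private static final AtomicLong CONSTRUCTIONS = new AtomicLong();

    // Route every pipeline construction in the job through this helper.
    static StanfordCoreNLP newPipeline(Properties props) {
        long n = CONSTRUCTIONS.incrementAndGet();
        System.err.println("pipeline #" + n + " constructed in this executor JVM");
        return new StanfordCoreNLP(props);
    }
}
```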