fmarten / JoSimText

A system for word sense induction and disambiguation based on the JoBimText approach

CoNLL processing exception #2

Closed (by alexanderpanchenko, 7 years ago)

alexanderpanchenko commented 7 years ago

Trying to extract features from this file causes the exception listed below:

~/Desktop/JoSimText/scripts$ bash dt_spark.sh conll ~/Desktop/test/cc16-conll-copp-sample.csv ~/Desktop/test/conll-output-3/ config/l.sh

The input file, with a blank line inserted after each sentence: http://panchenko.me/data/joint/corpora/cc16-conll-copp-sample-newlines.csv.gz
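One hypothesis for the NullPointerException inside univocity's LineReader (sketched below in Python for illustration; this is not the project's actual Scala code): if the CoNLL file is split into sentence blocks on blank lines, a trailing newline at the end of the file (or two consecutive blank lines) yields an empty final block, and handing an empty block to a line parser would fail at `beginParsing`.

```python
# Hypothetical repro sketch: splitting a CoNLL file on blank-line sentence
# separators leaves an empty trailing block when the file ends with a blank
# line. Parsing that empty block could be what triggers the NPE.
conll = "1\thello\t_\n2\tworld\t_\n\n1\tbye\t_\n\n"  # file ends with a blank line

blocks = conll.split("\n\n")
print(blocks)  # last element is an empty string, not a sentence

# A defensive filter before parsing would skip such empty blocks:
sentences = [b for b in blocks if b.strip()]
print(len(sentences))  # 2 real sentences
```

If this hypothesis holds, filtering out empty sentence blocks in `CoNLLParser.parseSingleSentence` (or upstream, before the UDF is applied) should make the job run through.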

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/08/18 23:09:13 INFO SparkContext: Running Spark version 2.2.0
17/08/18 23:09:14 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/08/18 23:09:19 INFO SparkContext: Submitted application: CoNLL2DepTermContext$
17/08/18 23:09:19 INFO SecurityManager: Changing view acls to: panchenko
17/08/18 23:09:19 INFO SecurityManager: Changing modify acls to: panchenko
17/08/18 23:09:19 INFO SecurityManager: Changing view acls groups to:
17/08/18 23:09:19 INFO SecurityManager: Changing modify acls groups to:
17/08/18 23:09:19 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(panchenko); groups with view permissions: Set(); users  with modify permissions: Set(panchenko); groups with modify permissions: Set()
17/08/18 23:09:19 INFO Utils: Successfully started service 'sparkDriver' on port 56249.
17/08/18 23:09:19 INFO SparkEnv: Registering MapOutputTracker
17/08/18 23:09:19 INFO SparkEnv: Registering BlockManagerMaster
17/08/18 23:09:19 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/08/18 23:09:19 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/08/18 23:09:19 INFO DiskBlockManager: Created local directory at /private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/blockmgr-1a908ae1-d1d7-4acb-8bbd-0b61aa43e047
17/08/18 23:09:19 INFO MemoryStore: MemoryStore started with capacity 4.1 GB
17/08/18 23:09:19 INFO SparkEnv: Registering OutputCommitCoordinator
17/08/18 23:09:19 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/08/18 23:09:20 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://10.0.1.3:4040
17/08/18 23:09:20 INFO SparkContext: Added JAR file:/Users/panchenko/Desktop/JoSimText/scripts/../target/scala-2.11/josimtext_2.11-0.4.jar at spark://10.0.1.3:56249/jars/josimtext_2.11-0.4.jar with timestamp 1503090560059
17/08/18 23:09:20 INFO Executor: Starting executor ID driver on host localhost
17/08/18 23:09:20 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 56250.
17/08/18 23:09:20 INFO NettyBlockTransferService: Server created on 10.0.1.3:56250
17/08/18 23:09:20 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/08/18 23:09:20 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 10.0.1.3, 56250, None)
17/08/18 23:09:20 INFO BlockManagerMasterEndpoint: Registering block manager 10.0.1.3:56250 with 4.1 GB RAM, BlockManagerId(driver, 10.0.1.3, 56250, None)
17/08/18 23:09:20 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 10.0.1.3, 56250, None)
17/08/18 23:09:20 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 10.0.1.3, 56250, None)
17/08/18 23:09:20 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/Users/panchenko/Desktop/JoSimText/scripts/spark-warehouse/').
17/08/18 23:09:20 INFO SharedState: Warehouse path is 'file:/Users/panchenko/Desktop/JoSimText/scripts/spark-warehouse/'.
17/08/18 23:09:21 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
17/08/18 23:09:23 INFO FileSourceStrategy: Pruning directories with:
17/08/18 23:09:23 INFO FileSourceStrategy: Post-Scan Filters:
17/08/18 23:09:23 INFO FileSourceStrategy: Output Data Schema: struct<value: string>
17/08/18 23:09:23 INFO FileSourceScanExec: Pushed Filters:
17/08/18 23:09:23 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/18 23:09:23 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/18 23:09:24 INFO CodeGenerator: Code generated in 158.486406 ms
17/08/18 23:09:24 INFO CodeGenerator: Code generated in 51.904684 ms
17/08/18 23:09:24 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 277.3 KB, free 4.1 GB)
17/08/18 23:09:24 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 23.4 KB, free 4.1 GB)
17/08/18 23:09:24 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 10.0.1.3:56250 (size: 23.4 KB, free: 4.1 GB)
17/08/18 23:09:24 INFO SparkContext: Created broadcast 0 from broadcast at DefaultSource.scala:86
17/08/18 23:09:24 INFO FileSourceScanExec: Planning scan with bin packing, max size: 23280808 bytes, open cost is considered as scanning 4194304 bytes.
17/08/18 23:09:24 INFO SparkContext: Starting job: text at CoNLL2DepTermContext.scala:29
17/08/18 23:09:24 INFO DAGScheduler: Got job 0 (text at CoNLL2DepTermContext.scala:29) with 4 output partitions
17/08/18 23:09:24 INFO DAGScheduler: Final stage: ResultStage 0 (text at CoNLL2DepTermContext.scala:29)
17/08/18 23:09:24 INFO DAGScheduler: Parents of final stage: List()
17/08/18 23:09:24 INFO DAGScheduler: Missing parents: List()
17/08/18 23:09:24 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[4] at text at CoNLL2DepTermContext.scala:29), which has no missing parents
17/08/18 23:09:24 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 99.2 KB, free 4.1 GB)
17/08/18 23:09:24 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 35.6 KB, free 4.1 GB)
17/08/18 23:09:24 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 10.0.1.3:56250 (size: 35.6 KB, free: 4.1 GB)
17/08/18 23:09:24 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
17/08/18 23:09:24 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 0 (MapPartitionsRDD[4] at text at CoNLL2DepTermContext.scala:29) (first 15 tasks are for partitions Vector(0, 1, 2, 3))
17/08/18 23:09:24 INFO TaskSchedulerImpl: Adding task set 0.0 with 4 tasks
17/08/18 23:09:24 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 5300 bytes)
17/08/18 23:09:24 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 5300 bytes)
17/08/18 23:09:24 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 5300 bytes)
17/08/18 23:09:24 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 5300 bytes)
17/08/18 23:09:24 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
17/08/18 23:09:24 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
17/08/18 23:09:24 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
17/08/18 23:09:24 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/08/18 23:09:24 INFO Executor: Fetching spark://10.0.1.3:56249/jars/josimtext_2.11-0.4.jar with timestamp 1503090560059
17/08/18 23:09:25 INFO TransportClientFactory: Successfully created connection to /10.0.1.3:56249 after 46 ms (0 ms spent in bootstraps)
17/08/18 23:09:25 INFO Utils: Fetching spark://10.0.1.3:56249/jars/josimtext_2.11-0.4.jar to /private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/spark-5eea660b-5cc0-476a-8347-be8044dd897e/userFiles-21df9990-d05e-4469-b394-b397092a2323/fetchFileTemp5738246302582940000.tmp
17/08/18 23:09:25 INFO Executor: Adding file:/private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/spark-5eea660b-5cc0-476a-8347-be8044dd897e/userFiles-21df9990-d05e-4469-b394-b397092a2323/josimtext_2.11-0.4.jar to class loader
17/08/18 23:09:25 INFO CodeGenerator: Code generated in 30.108062 ms
17/08/18 23:09:25 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/18 23:09:25 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/18 23:09:25 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/18 23:09:25 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/18 23:09:25 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/18 23:09:25 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/18 23:09:25 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/18 23:09:25 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/18 23:09:25 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample.csv, range: 0-23280808, partition values: [empty row]
17/08/18 23:09:25 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample.csv, range: 69842424-88928930, partition values: [empty row]
17/08/18 23:09:25 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample.csv, range: 23280808-46561616, partition values: [empty row]
17/08/18 23:09:25 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample.csv, range: 46561616-69842424, partition values: [empty row]
17/08/18 23:09:25 INFO CodeGenerator: Code generated in 13.221717 ms
17/08/18 23:09:25 ERROR Utils: Aborting task
org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
[... the identical "ERROR Utils: Aborting task" exception is repeated three more times, once per remaining task ...]
17/08/18 23:09:25 ERROR FileFormatWriter: Job job_20170818230925_0000 aborted.
17/08/18 23:09:25 ERROR Executor: Exception in task 2.0 in stage 0.0 (TID 2)
org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
[... identical "ERROR Executor: Exception in task" traces repeated for tasks 1.0 and 0.0, with the same NullPointerException cause ...]
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
17/08/18 23:09:25 ERROR Executor: Exception in task 3.0 in stage 0.0 (TID 3)
org.apache.spark.SparkException: Task failed while writing rows
17/08/18 23:09:25 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows

17/08/18 23:09:25 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
17/08/18 23:09:25 INFO TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2) on localhost, executor driver: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 1]
17/08/18 23:09:25 INFO TaskSetManager: Lost task 3.0 in stage 0.0 (TID 3) on localhost, executor driver: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 2]
17/08/18 23:09:25 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/08/18 23:09:25 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on localhost, executor driver: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 3]
17/08/18 23:09:25 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/08/18 23:09:25 INFO TaskSchedulerImpl: Cancelling stage 0
17/08/18 23:09:25 INFO DAGScheduler: ResultStage 0 (text at CoNLL2DepTermContext.scala:29) failed in 0.720 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows

Driver stacktrace:
17/08/18 23:09:25 INFO DAGScheduler: Job 0 failed: text at CoNLL2DepTermContext.scala:29, took 0.892415 s
17/08/18 23:09:25 ERROR FileFormatWriter: Aborting job null.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:188)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:145)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:438)
    at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:474)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:610)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
    at org.apache.spark.sql.DataFrameWriter.text(DataFrameWriter.scala:555)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$.main(CoNLL2DepTermContext.scala:29)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext.main(CoNLL2DepTermContext.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Task failed while writing rows
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
Caused by: java.lang.NullPointerException
Exception in thread "main" org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:215)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:145)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:438)
    at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:474)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:610)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
    at org.apache.spark.sql.DataFrameWriter.text(DataFrameWriter.scala:555)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$.main(CoNLL2DepTermContext.scala:29)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext.main(CoNLL2DepTermContext.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:188)
    ... 45 more
Caused by: org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
17/08/18 23:09:25 INFO SparkContext: Invoking stop() from shutdown hook
17/08/18 23:09:25 INFO SparkUI: Stopped Spark web UI at http://10.0.1.3:4040
17/08/18 23:09:25 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/08/18 23:09:25 INFO MemoryStore: MemoryStore cleared
17/08/18 23:09:25 INFO BlockManager: BlockManager stopped
17/08/18 23:09:25 INFO BlockManagerMaster: BlockManagerMaster stopped
17/08/18 23:09:25 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/08/18 23:09:25 INFO SparkContext: Successfully stopped SparkContext
17/08/18 23:09:25 INFO ShutdownHookManager: Shutdown hook called
17/08/18 23:09:25 INFO ShutdownHookManager: Deleting directory /private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/spark-5eea660b-5cc0-476a-8347-be8044dd897e
panchenko@Alexanders-MacBook-Pro:~/Desktop/JoSimText/scripts$
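The repeated `NullPointerException` inside `com.univocity.parsers.common.LineReader.read`, triggered from `CoNLLParser.parseSingleSentence` calling `parseLine`, suggests the parser is being handed a null or empty line, e.g. when consecutive blank lines in the input produce an empty sentence block. As a minimal, hypothetical sketch (not the actual JoSimText code), splitting the raw CoNLL text into sentence blocks while discarding empty ones would guarantee that no null/empty line ever reaches the CSV parser:

```java
import java.util.ArrayList;
import java.util.List;

public class SentenceSplitter {
    // Split raw CoNLL text into sentence blocks separated by blank lines.
    // Empty blocks (caused e.g. by consecutive newlines) are dropped, so
    // every line handed to a downstream line parser is non-null and non-empty.
    public static List<List<String>> split(String conll) {
        List<List<String>> sentences = new ArrayList<>();
        List<String> current = new ArrayList<>();
        for (String line : conll.split("\n", -1)) {
            if (line.trim().isEmpty()) {
                // Blank line: close the current sentence if it has content.
                if (!current.isEmpty()) {
                    sentences.add(current);
                    current = new ArrayList<>();
                }
            } else {
                current.add(line);
            }
        }
        if (!current.isEmpty()) sentences.add(current);
        return sentences;
    }
}
```

With this kind of guard in place, an input containing doubled blank lines yields only the two non-empty sentence blocks instead of an empty block that would crash the parser.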
alexanderpanchenko commented 7 years ago

I provided the input files above.

I used the latest version of Spark, downloaded from the website a week ago, with the following configuration:

hadoop=hadoop
hadoop_xmx_mb=8192
hadoop_mb=8000
spark=/Users/panchenko/Desktop/spark-2.2.0-bin-hadoop2.7/bin/spark-submit
spark_gb=8
hadoop_conf_dir=/etc/hadoop/conf/
yarn_conf_dir=/etc/hadoop/conf.cloudera.yarn/
mwe_dict_path="voc/voc-mwe6446031-dbpedia-babelnet-wordnet-dela.csv"
queue=default
master=local[*]
num_executors=4

bin_spark=`ls ../target/scala-*/jo*.jar`
bin_hadoop="../bin/hadoop/"
alexanderpanchenko commented 7 years ago
panchenko@Alexanders-MacBook-Pro:~/Desktop/JoSimText/scripts$ bash dt_spark.sh conll ~/Desktop/test/cc16-conll-copp-sample-newlines.csv ~/Desktop/test/conll-output-5/ config/l.sh
Format: conll
Input: /Users/panchenko/Desktop/test/cc16-conll-copp-sample-newlines.csv
Output: /Users/panchenko/Desktop/test/conll-output-5/
To start press any key, to stop press Ctrl+C

Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
17/08/21 17:01:36 INFO SparkContext: Running Spark version 2.2.0
17/08/21 17:01:36 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
17/08/21 17:01:37 INFO SparkContext: Submitted application: CoNLL2DepTermContext$
17/08/21 17:01:37 INFO SecurityManager: Changing view acls to: panchenko
17/08/21 17:01:37 INFO SecurityManager: Changing modify acls to: panchenko
17/08/21 17:01:37 INFO SecurityManager: Changing view acls groups to:
17/08/21 17:01:37 INFO SecurityManager: Changing modify acls groups to:
17/08/21 17:01:37 INFO SecurityManager: SecurityManager: authentication disabled; ui acls disabled; users  with view permissions: Set(panchenko); groups with view permissions: Set(); users  with modify permissions: Set(panchenko); groups with modify permissions: Set()
17/08/21 17:01:37 INFO Utils: Successfully started service 'sparkDriver' on port 63661.
17/08/21 17:01:37 INFO SparkEnv: Registering MapOutputTracker
17/08/21 17:01:37 INFO SparkEnv: Registering BlockManagerMaster
17/08/21 17:01:37 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
17/08/21 17:01:37 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
17/08/21 17:01:37 INFO DiskBlockManager: Created local directory at /private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/blockmgr-f72fc11f-4fc4-4729-b510-f57e5a81ce09
17/08/21 17:01:37 INFO MemoryStore: MemoryStore started with capacity 4.1 GB
17/08/21 17:01:37 INFO SparkEnv: Registering OutputCommitCoordinator
17/08/21 17:01:37 INFO Utils: Successfully started service 'SparkUI' on port 4040.
17/08/21 17:01:37 INFO SparkUI: Bound SparkUI to 0.0.0.0, and started at http://134.100.15.237:4040
17/08/21 17:01:37 INFO SparkContext: Added JAR file:/Users/panchenko/Desktop/JoSimText/scripts/../target/scala-2.11/josimtext_2.11-0.4.jar at spark://134.100.15.237:63661/jars/josimtext_2.11-0.4.jar with timestamp 1503327697927
17/08/21 17:01:38 INFO Executor: Starting executor ID driver on host localhost
17/08/21 17:01:38 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 63662.
17/08/21 17:01:38 INFO NettyBlockTransferService: Server created on 134.100.15.237:63662
17/08/21 17:01:38 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
17/08/21 17:01:38 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, 134.100.15.237, 63662, None)
17/08/21 17:01:38 INFO BlockManagerMasterEndpoint: Registering block manager 134.100.15.237:63662 with 4.1 GB RAM, BlockManagerId(driver, 134.100.15.237, 63662, None)
17/08/21 17:01:38 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, 134.100.15.237, 63662, None)
17/08/21 17:01:38 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, 134.100.15.237, 63662, None)
17/08/21 17:01:38 INFO SharedState: Setting hive.metastore.warehouse.dir ('null') to the value of spark.sql.warehouse.dir ('file:/Users/panchenko/Desktop/JoSimText/scripts/spark-warehouse/').
17/08/21 17:01:38 INFO SharedState: Warehouse path is 'file:/Users/panchenko/Desktop/JoSimText/scripts/spark-warehouse/'.
17/08/21 17:01:39 INFO StateStoreCoordinatorRef: Registered StateStoreCoordinator endpoint
17/08/21 17:01:42 INFO FileSourceStrategy: Pruning directories with:
17/08/21 17:01:42 INFO FileSourceStrategy: Post-Scan Filters:
17/08/21 17:01:42 INFO FileSourceStrategy: Output Data Schema: struct<value: string>
17/08/21 17:01:42 INFO FileSourceScanExec: Pushed Filters:
17/08/21 17:01:42 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/21 17:01:42 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/21 17:01:42 INFO CodeGenerator: Code generated in 197.886328 ms
17/08/21 17:01:43 INFO CodeGenerator: Code generated in 69.710432 ms
17/08/21 17:01:43 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 277.4 KB, free 4.1 GB)
17/08/21 17:01:43 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 23.4 KB, free 4.1 GB)
17/08/21 17:01:43 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 134.100.15.237:63662 (size: 23.4 KB, free: 4.1 GB)
17/08/21 17:01:43 INFO SparkContext: Created broadcast 0 from broadcast at DefaultSource.scala:86
17/08/21 17:01:43 INFO FileSourceScanExec: Planning scan with bin packing, max size: 23280808 bytes, open cost is considered as scanning 4194304 bytes.
17/08/21 17:01:43 INFO SparkContext: Starting job: text at CoNLL2DepTermContext.scala:29
17/08/21 17:01:43 INFO DAGScheduler: Got job 0 (text at CoNLL2DepTermContext.scala:29) with 4 output partitions
17/08/21 17:01:43 INFO DAGScheduler: Final stage: ResultStage 0 (text at CoNLL2DepTermContext.scala:29)
17/08/21 17:01:43 INFO DAGScheduler: Parents of final stage: List()
17/08/21 17:01:43 INFO DAGScheduler: Missing parents: List()
17/08/21 17:01:43 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[4] at text at CoNLL2DepTermContext.scala:29), which has no missing parents
17/08/21 17:01:43 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 99.2 KB, free 4.1 GB)
17/08/21 17:01:43 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 35.7 KB, free 4.1 GB)
17/08/21 17:01:43 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 134.100.15.237:63662 (size: 35.7 KB, free: 4.1 GB)
17/08/21 17:01:43 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
17/08/21 17:01:43 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 0 (MapPartitionsRDD[4] at text at CoNLL2DepTermContext.scala:29) (first 15 tasks are for partitions Vector(0, 1, 2, 3))
17/08/21 17:01:43 INFO TaskSchedulerImpl: Adding task set 0.0 with 4 tasks
17/08/21 17:01:43 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 5309 bytes)
17/08/21 17:01:43 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 5309 bytes)
17/08/21 17:01:43 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 5309 bytes)
17/08/21 17:01:43 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 5309 bytes)
17/08/21 17:01:43 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/08/21 17:01:43 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
17/08/21 17:01:43 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
17/08/21 17:01:43 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
17/08/21 17:01:43 INFO Executor: Fetching spark://134.100.15.237:63661/jars/josimtext_2.11-0.4.jar with timestamp 1503327697927
17/08/21 17:01:43 INFO TransportClientFactory: Successfully created connection to /134.100.15.237:63661 after 46 ms (0 ms spent in bootstraps)
17/08/21 17:01:43 INFO Utils: Fetching spark://134.100.15.237:63661/jars/josimtext_2.11-0.4.jar to /private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/spark-e6c2d680-6e1e-4020-90b4-a63a9a29b698/userFiles-0aedfc8f-5af2-4344-a315-7cd0282f4163/fetchFileTemp6873320287201534938.tmp
17/08/21 17:01:43 INFO Executor: Adding file:/private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/spark-e6c2d680-6e1e-4020-90b4-a63a9a29b698/userFiles-0aedfc8f-5af2-4344-a315-7cd0282f4163/josimtext_2.11-0.4.jar to class loader
17/08/21 17:01:44 INFO CodeGenerator: Code generated in 52.712156 ms
17/08/21 17:01:44 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/21 17:01:44 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/21 17:01:44 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/21 17:01:44 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/21 17:01:44 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/21 17:01:44 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/21 17:01:44 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/21 17:01:44 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/21 17:01:44 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample-newlines.csv, range: 0-23280808, partition values: [empty row]
17/08/21 17:01:44 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample-newlines.csv, range: 23280808-46561616, partition values: [empty row]
17/08/21 17:01:44 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample-newlines.csv, range: 46561616-69842424, partition values: [empty row]
17/08/21 17:01:44 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample-newlines.csv, range: 69842424-88928930, partition values: [empty row]
17/08/21 17:01:44 INFO CodeGenerator: Code generated in 15.486832 ms
17/08/21 17:01:44 ERROR Utils: Aborting task
org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
17/08/21 17:01:44 ERROR Utils: Aborting task
org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
17/08/21 17:01:44 ERROR FileFormatWriter: Job job_20170821170144_0000 aborted.
17/08/21 17:01:44 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
17/08/21 17:01:44 WARN TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows

17/08/21 17:01:44 ERROR TaskSetManager: Task 0 in stage 0.0 failed 1 times; aborting job
17/08/21 17:01:44 INFO TaskSetManager: Lost task 3.0 in stage 0.0 (TID 3) on localhost, executor driver: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 1]
17/08/21 17:01:44 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on localhost, executor driver: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 2]
17/08/21 17:01:44 INFO TaskSchedulerImpl: Cancelling stage 0
17/08/21 17:01:44 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/08/21 17:01:44 INFO TaskSchedulerImpl: Stage 0 was cancelled
17/08/21 17:01:44 INFO TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2) on localhost, executor driver: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 3]
17/08/21 17:01:44 INFO DAGScheduler: ResultStage 0 (text at CoNLL2DepTermContext.scala:29) failed in 0.742 s due to Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows

Driver stacktrace:
17/08/21 17:01:44 INFO DAGScheduler: Job 0 failed: text at CoNLL2DepTermContext.scala:29, took 0.966458 s
17/08/21 17:01:44 ERROR FileFormatWriter: Aborting job null.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:188)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:145)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:438)
    at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:474)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:610)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
    at org.apache.spark.sql.DataFrameWriter.text(DataFrameWriter.scala:555)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$.main(CoNLL2DepTermContext.scala:29)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext.main(CoNLL2DepTermContext.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
Exception in thread "main" org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:215)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:145)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:438)
    at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:474)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:610)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
    at org.apache.spark.sql.DataFrameWriter.text(DataFrameWriter.scala:555)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$.main(CoNLL2DepTermContext.scala:29)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext.main(CoNLL2DepTermContext.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 0 in stage 0.0 failed 1 times, most recent failure: Lost task 0.0 in stage 0.0 (TID 0, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:188)
    ... 45 more
Caused by: org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.NullPointerException
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.input.AbstractCharInputReader.updateBuffer(AbstractCharInputReader.java:159)
    at com.univocity.parsers.common.input.AbstractCharInputReader.start(AbstractCharInputReader.java:145)
    at com.univocity.parsers.common.AbstractParser.beginParsing(AbstractParser.java:232)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:523)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
17/08/21 17:01:44 INFO BlockManagerInfo: Removed broadcast_1_piece0 on 134.100.15.237:63662 in memory (size: 35.7 KB, free: 4.1 GB)
17/08/21 17:01:44 INFO SparkContext: Invoking stop() from shutdown hook
17/08/21 17:01:44 INFO SparkUI: Stopped Spark web UI at http://134.100.15.237:4040
17/08/21 17:01:44 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/08/21 17:01:44 INFO MemoryStore: MemoryStore cleared
17/08/21 17:01:44 INFO BlockManager: BlockManager stopped
17/08/21 17:01:44 INFO BlockManagerMaster: BlockManagerMaster stopped
17/08/21 17:01:44 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/08/21 17:01:44 INFO SparkContext: Successfully stopped SparkContext
17/08/21 17:01:44 INFO ShutdownHookManager: Shutdown hook called
17/08/21 17:01:44 INFO ShutdownHookManager: Deleting directory /private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/spark-e6c2d680-6e1e-4020-90b4-a63a9a29b698
panchenko@Alexanders-MacBook-Pro:~/Desktop/JoSimText/scripts$
alexanderpanchenko commented 7 years ago
17/08/21 17:21:59 INFO MemoryStore: Block broadcast_0 stored as values in memory (estimated size 277.4 KB, free 4.1 GB)
17/08/21 17:21:59 INFO MemoryStore: Block broadcast_0_piece0 stored as bytes in memory (estimated size 23.4 KB, free 4.1 GB)
17/08/21 17:21:59 INFO BlockManagerInfo: Added broadcast_0_piece0 in memory on 134.100.15.237:64163 (size: 23.4 KB, free: 4.1 GB)
17/08/21 17:21:59 INFO SparkContext: Created broadcast 0 from broadcast at DefaultSource.scala:86
17/08/21 17:21:59 INFO FileSourceScanExec: Planning scan with bin packing, max size: 23280808 bytes, open cost is considered as scanning 4194304 bytes.
17/08/21 17:21:59 INFO SparkContext: Starting job: text at CoNLL2DepTermContext.scala:29
17/08/21 17:21:59 INFO DAGScheduler: Got job 0 (text at CoNLL2DepTermContext.scala:29) with 4 output partitions
17/08/21 17:21:59 INFO DAGScheduler: Final stage: ResultStage 0 (text at CoNLL2DepTermContext.scala:29)
17/08/21 17:21:59 INFO DAGScheduler: Parents of final stage: List()
17/08/21 17:21:59 INFO DAGScheduler: Missing parents: List()
17/08/21 17:21:59 INFO DAGScheduler: Submitting ResultStage 0 (MapPartitionsRDD[4] at text at CoNLL2DepTermContext.scala:29), which has no missing parents
17/08/21 17:21:59 INFO MemoryStore: Block broadcast_1 stored as values in memory (estimated size 99.2 KB, free 4.1 GB)
17/08/21 17:21:59 INFO MemoryStore: Block broadcast_1_piece0 stored as bytes in memory (estimated size 35.7 KB, free 4.1 GB)
17/08/21 17:21:59 INFO BlockManagerInfo: Added broadcast_1_piece0 in memory on 134.100.15.237:64163 (size: 35.7 KB, free: 4.1 GB)
17/08/21 17:21:59 INFO SparkContext: Created broadcast 1 from broadcast at DAGScheduler.scala:1006
17/08/21 17:21:59 INFO DAGScheduler: Submitting 4 missing tasks from ResultStage 0 (MapPartitionsRDD[4] at text at CoNLL2DepTermContext.scala:29) (first 15 tasks are for partitions Vector(0, 1, 2, 3))
17/08/21 17:21:59 INFO TaskSchedulerImpl: Adding task set 0.0 with 4 tasks
17/08/21 17:21:59 INFO TaskSetManager: Starting task 0.0 in stage 0.0 (TID 0, localhost, executor driver, partition 0, PROCESS_LOCAL, 5309 bytes)
17/08/21 17:21:59 INFO TaskSetManager: Starting task 1.0 in stage 0.0 (TID 1, localhost, executor driver, partition 1, PROCESS_LOCAL, 5309 bytes)
17/08/21 17:21:59 INFO TaskSetManager: Starting task 2.0 in stage 0.0 (TID 2, localhost, executor driver, partition 2, PROCESS_LOCAL, 5309 bytes)
17/08/21 17:21:59 INFO TaskSetManager: Starting task 3.0 in stage 0.0 (TID 3, localhost, executor driver, partition 3, PROCESS_LOCAL, 5309 bytes)
17/08/21 17:21:59 INFO Executor: Running task 1.0 in stage 0.0 (TID 1)
17/08/21 17:21:59 INFO Executor: Running task 3.0 in stage 0.0 (TID 3)
17/08/21 17:21:59 INFO Executor: Running task 0.0 in stage 0.0 (TID 0)
17/08/21 17:21:59 INFO Executor: Running task 2.0 in stage 0.0 (TID 2)
17/08/21 17:21:59 INFO Executor: Fetching spark://134.100.15.237:64162/jars/josimtext_2.11-0.4.jar with timestamp 1503328913959
17/08/21 17:21:59 INFO TransportClientFactory: Successfully created connection to /134.100.15.237:64162 after 43 ms (0 ms spent in bootstraps)
17/08/21 17:21:59 INFO Utils: Fetching spark://134.100.15.237:64162/jars/josimtext_2.11-0.4.jar to /private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/spark-af65d987-ad1c-44a2-b25b-b399833e3f9a/userFiles-a36d224f-8585-47f6-aba0-b18af4786225/fetchFileTemp5457884890964014860.tmp
17/08/21 17:22:00 INFO Executor: Adding file:/private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/spark-af65d987-ad1c-44a2-b25b-b399833e3f9a/userFiles-a36d224f-8585-47f6-aba0-b18af4786225/josimtext_2.11-0.4.jar to class loader
17/08/21 17:22:00 INFO CodeGenerator: Code generated in 89.202357 ms
17/08/21 17:22:00 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/21 17:22:00 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/21 17:22:00 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/21 17:22:00 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/21 17:22:00 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/21 17:22:00 INFO FileOutputCommitter: File Output Committer Algorithm version is 1
17/08/21 17:22:00 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/21 17:22:00 INFO SQLHadoopMapReduceCommitProtocol: Using output committer class org.apache.hadoop.mapreduce.lib.output.FileOutputCommitter
17/08/21 17:22:00 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample-newlines.csv, range: 0-23280808, partition values: [empty row]
17/08/21 17:22:00 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample-newlines.csv, range: 46561616-69842424, partition values: [empty row]
17/08/21 17:22:00 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample-newlines.csv, range: 69842424-88928930, partition values: [empty row]
17/08/21 17:22:00 INFO FileScanRDD: Reading File path: file:///Users/panchenko/Desktop/test/cc16-conll-copp-sample-newlines.csv, range: 23280808-46561616, partition values: [empty row]
17/08/21 17:22:00 INFO CodeGenerator: Code generated in 10.500173 ms
17/08/21 17:22:00 ERROR Utils: Aborting task
org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: 36
    at java.lang.String.getChars(String.java:821)
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:525)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
17/08/21 17:22:00 ERROR Utils: Aborting task
org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.NullPointerException
    at de.uhh.lt.conll.CoNLLParser$.de$uhh$lt$conll$CoNLLParser$$readRow(CoNLLParser.scala:49)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$4.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$4.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
17/08/21 17:22:00 ERROR Utils: Aborting task
org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: com.univocity.parsers.common.TextParsingException: java.lang.ArrayIndexOutOfBoundsException - null
Parser Configuration: CsvParserSettings:
    Auto configuration enabled=true
    Autodetect column delimiter=false
    Autodetect quotes=false
    Column reordering enabled=true
    Empty value=null
    Escape unquoted values=false
    Header extraction enabled=null
    Headers=null
    Ignore leading whitespaces=true
    Ignore trailing whitespaces=true
    Input buffer size=1048576
    Input reading on separate thread=false
    Keep escape sequences=false
    Keep quotes=false
    Length of content displayed on error=-1
    Line separator detection enabled=false
    Maximum number of characters per column=4096
    Maximum number of columns=512
    Normalize escaped line separators=true
    Null value=
    Number of records to read=all
    Processor=none
    Restricting data in exceptions=false
    RowProcessor error handler=null
    Selected fields=none
    Skip empty lines=true
    Unescaped quote handling=null
Format configuration:
    CsvFormat:
        Comment character=#
        Field delimiter=\t
        Line separator (normalized)=\n
        Line separator sequence=\n
        Quote character=\0
        Quote escape character=\0
        Quote escape escape character=null
Internal state when error was thrown: line=39, column=0, record=35, charIndex=1182
    at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:195)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:544)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at com.univocity.parsers.common.ParserOutput.rowParsed(ParserOutput.java:166)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:189)
    ... 30 more
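A side note on the `Parser Configuration: CsvParserSettings` dump above: univocity is running with its defaults of 4096 characters per column and 512 columns, no quoting, and tab as the field delimiter. One guess (unverified, and not necessarily the root cause of the `NullPointerException` in `readRow`) is that a garbled CoNLL row blows past one of these limits and trips univocity's buffer handling. A minimal sketch of how the limits could be relaxed where the settings are constructed — this is hypothetical illustration of the univocity API, not the actual code in `CoNLLParser.scala`:

```scala
import com.univocity.parsers.csv.{CsvParser, CsvParserSettings}

val settings = new CsvParserSettings()
settings.getFormat.setDelimiter('\t')   // CoNLL fields are tab-separated
settings.getFormat.setQuote('\u0000')   // no quoting, matching the dump above
settings.setMaxCharsPerColumn(-1)       // -1 disables the 4096-char-per-column limit
settings.setMaxColumns(1024)            // raise the 512-column default

val parser = new CsvParser(settings)
// parseLine returns the tab-separated fields of a single CoNLL row
val fields: Array[String] = parser.parseLine("1\tHello\thello\tUH")
```

If the exceptions persist with the limits disabled, the problem is more likely malformed rows themselves (e.g. a row with fewer columns than `readRow` expects, which would explain the NPE at `CoNLLParser.scala:49`).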
17/08/21 17:22:00 ERROR Utils: Aborting task
org.apache.spark.SparkException: Failed to execute user defined function(anonfun$2: (string) => array<struct<id:string,form:string,lemma:string,upostag:string,xpostag:string,feats:string,head:string,deprel:string,deps:string,misc:string>>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: 36
    at java.lang.String.getChars(String.java:821)
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:525)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$2.apply(CoNLL2DepTermContext.scala:41)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$2.apply(CoNLL2DepTermContext.scala:41)
    ... 23 more
17/08/21 17:22:00 ERROR FileFormatWriter: Job job_20170821172200_0000 aborted.
17/08/21 17:22:00 ERROR Executor: Exception in task 2.0 in stage 0.0 (TID 2)
org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: com.univocity.parsers.common.TextParsingException: java.lang.ArrayIndexOutOfBoundsException - null
Parser Configuration: CsvParserSettings:
    Auto configuration enabled=true
    Autodetect column delimiter=false
    Autodetect quotes=false
    Column reordering enabled=true
    Empty value=null
    Escape unquoted values=false
    Header extraction enabled=null
    Headers=null
    Ignore leading whitespaces=true
    Ignore trailing whitespaces=true
    Input buffer size=1048576
    Input reading on separate thread=false
    Keep escape sequences=false
    Keep quotes=false
    Length of content displayed on error=-1
    Line separator detection enabled=false
    Maximum number of characters per column=4096
    Maximum number of columns=512
    Normalize escaped line separators=true
    Null value=
    Number of records to read=all
    Processor=none
    Restricting data in exceptions=false
    RowProcessor error handler=null
    Selected fields=none
    Skip empty lines=true
    Unescaped quote handling=null
Format configuration:
    CsvFormat:
        Comment character=#
        Field delimiter=\t
        Line separator (normalized)=\n
        Line separator sequence=\n
        Quote character=\0
        Quote escape character=\0
        Quote escape escape character=null
Internal state when error was thrown: line=39, column=0, record=35, charIndex=1182
    at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:195)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:544)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at com.univocity.parsers.common.ParserOutput.rowParsed(ParserOutput.java:166)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:189)
    ... 30 more
17/08/21 17:22:00 ERROR Executor: Exception in task 3.0 in stage 0.0 (TID 3)
org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$2: (string) => array<struct<id:string,form:string,lemma:string,upostag:string,xpostag:string,feats:string,head:string,deprel:string,deps:string,misc:string>>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: 36
    at java.lang.String.getChars(String.java:821)
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:525)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$2.apply(CoNLL2DepTermContext.scala:41)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$2.apply(CoNLL2DepTermContext.scala:41)
    ... 23 more
17/08/21 17:22:00 ERROR Executor: Exception in task 0.0 in stage 0.0 (TID 0)
org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.NullPointerException
    at de.uhh.lt.conll.CoNLLParser$.de$uhh$lt$conll$CoNLLParser$$readRow(CoNLLParser.scala:49)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$4.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$4.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:273)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
17/08/21 17:22:00 ERROR Executor: Exception in task 1.0 in stage 0.0 (TID 1)
org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: java.lang.StringIndexOutOfBoundsException: String index out of range: 36
    at java.lang.String.getChars(String.java:821)
    at com.univocity.parsers.common.LineReader.read(LineReader.java:51)
    at com.univocity.parsers.common.input.DefaultCharInputReader.reloadBuffer(DefaultCharInputReader.java:75)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:525)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
17/08/21 17:22:00 WARN TaskSetManager: Lost task 2.0 in stage 0.0 (TID 2, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: com.univocity.parsers.common.TextParsingException: java.lang.ArrayIndexOutOfBoundsException - null
Parser Configuration: CsvParserSettings:
    Auto configuration enabled=true
    Autodetect column delimiter=false
    Autodetect quotes=false
    Column reordering enabled=true
    Empty value=null
    Escape unquoted values=false
    Header extraction enabled=null
    Headers=null
    Ignore leading whitespaces=true
    Ignore trailing whitespaces=true
    Input buffer size=1048576
    Input reading on separate thread=false
    Keep escape sequences=false
    Keep quotes=false
    Length of content displayed on error=-1
    Line separator detection enabled=false
    Maximum number of characters per column=4096
    Maximum number of columns=512
    Normalize escaped line separators=true
    Null value=
    Number of records to read=all
    Processor=none
    Restricting data in exceptions=false
    RowProcessor error handler=null
    Selected fields=none
    Skip empty lines=true
    Unescaped quote handling=null
Format configuration:
    CsvFormat:
        Comment character=#
        Field delimiter=\t
        Line separator (normalized)=\n
        Line separator sequence=\n
        Quote character=\0
        Quote escape character=\0
        Quote escape escape character=null
Internal state when error was thrown: line=39, column=0, record=35, charIndex=1182
    at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:195)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:544)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at com.univocity.parsers.common.ParserOutput.rowParsed(ParserOutput.java:166)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:189)
    ... 30 more

17/08/21 17:22:00 ERROR TaskSetManager: Task 2 in stage 0.0 failed 1 times; aborting job
17/08/21 17:22:00 INFO TaskSetManager: Lost task 1.0 in stage 0.0 (TID 1) on localhost, executor driver: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 1]
17/08/21 17:22:00 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/08/21 17:22:00 INFO TaskSetManager: Lost task 3.0 in stage 0.0 (TID 3) on localhost, executor driver: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 2]
17/08/21 17:22:00 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/08/21 17:22:00 INFO TaskSetManager: Lost task 0.0 in stage 0.0 (TID 0) on localhost, executor driver: org.apache.spark.SparkException (Task failed while writing rows) [duplicate 3]
17/08/21 17:22:00 INFO TaskSchedulerImpl: Removed TaskSet 0.0, whose tasks have all completed, from pool
17/08/21 17:22:00 INFO TaskSchedulerImpl: Cancelling stage 0
17/08/21 17:22:00 INFO DAGScheduler: ResultStage 0 (text at CoNLL2DepTermContext.scala:29) failed in 0.848 s due to Job aborted due to stage failure: Task 2 in stage 0.0 failed 1 times, most recent failure: Lost task 2.0 in stage 0.0 (TID 2, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: com.univocity.parsers.common.TextParsingException: java.lang.ArrayIndexOutOfBoundsException - null
Parser Configuration: CsvParserSettings:
    Auto configuration enabled=true
    Autodetect column delimiter=false
    Autodetect quotes=false
    Column reordering enabled=true
    Empty value=null
    Escape unquoted values=false
    Header extraction enabled=null
    Headers=null
    Ignore leading whitespaces=true
    Ignore trailing whitespaces=true
    Input buffer size=1048576
    Input reading on separate thread=false
    Keep escape sequences=false
    Keep quotes=false
    Length of content displayed on error=-1
    Line separator detection enabled=false
    Maximum number of characters per column=4096
    Maximum number of columns=512
    Normalize escaped line separators=true
    Null value=
    Number of records to read=all
    Processor=none
    Restricting data in exceptions=false
    RowProcessor error handler=null
    Selected fields=none
    Skip empty lines=true
    Unescaped quote handling=null
Format configuration:
    CsvFormat:
        Comment character=#
        Field delimiter=\t
        Line separator (normalized)=\n
        Line separator sequence=\n
        Quote character=\0
        Quote escape character=\0
        Quote escape escape character=null
Internal state when error was thrown: line=39, column=0, record=35, charIndex=1182
    at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:195)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:544)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at com.univocity.parsers.common.ParserOutput.rowParsed(ParserOutput.java:166)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:189)
    ... 30 more

Driver stacktrace:
17/08/21 17:22:00 INFO DAGScheduler: Job 0 failed: text at CoNLL2DepTermContext.scala:29, took 1.044480 s
17/08/21 17:22:00 ERROR FileFormatWriter: Aborting job null.
org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 1 times, most recent failure: Lost task 2.0 in stage 0.0 (TID 2, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: com.univocity.parsers.common.TextParsingException: java.lang.ArrayIndexOutOfBoundsException - null
Parser Configuration: CsvParserSettings:
    Auto configuration enabled=true
    Autodetect column delimiter=false
    Autodetect quotes=false
    Column reordering enabled=true
    Empty value=null
    Escape unquoted values=false
    Header extraction enabled=null
    Headers=null
    Ignore leading whitespaces=true
    Ignore trailing whitespaces=true
    Input buffer size=1048576
    Input reading on separate thread=false
    Keep escape sequences=false
    Keep quotes=false
    Length of content displayed on error=-1
    Line separator detection enabled=false
    Maximum number of characters per column=4096
    Maximum number of columns=512
    Normalize escaped line separators=true
    Null value=
    Number of records to read=all
    Processor=none
    Restricting data in exceptions=false
    RowProcessor error handler=null
    Selected fields=none
    Skip empty lines=true
    Unescaped quote handling=null
Format configuration:
    CsvFormat:
        Comment character=#
        Field delimiter=\t
        Line separator (normalized)=\n
        Line separator sequence=\n
        Quote character=\0
        Quote escape character=\0
        Quote escape escape character=null
Internal state when error was thrown: line=39, column=0, record=35, charIndex=1182
    at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:195)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:544)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at com.univocity.parsers.common.ParserOutput.rowParsed(ParserOutput.java:166)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:189)
    ... 30 more

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:188)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:145)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:438)
    at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:474)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:610)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
    at org.apache.spark.sql.DataFrameWriter.text(DataFrameWriter.scala:555)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$.main(CoNLL2DepTermContext.scala:29)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext.main(CoNLL2DepTermContext.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: com.univocity.parsers.common.TextParsingException: java.lang.ArrayIndexOutOfBoundsException - null
Parser Configuration: CsvParserSettings:
    Auto configuration enabled=true
    Autodetect column delimiter=false
    Autodetect quotes=false
    Column reordering enabled=true
    Empty value=null
    Escape unquoted values=false
    Header extraction enabled=null
    Headers=null
    Ignore leading whitespaces=true
    Ignore trailing whitespaces=true
    Input buffer size=1048576
    Input reading on separate thread=false
    Keep escape sequences=false
    Keep quotes=false
    Length of content displayed on error=-1
    Line separator detection enabled=false
    Maximum number of characters per column=4096
    Maximum number of columns=512
    Normalize escaped line separators=true
    Null value=
    Number of records to read=all
    Processor=none
    Restricting data in exceptions=false
    RowProcessor error handler=null
    Selected fields=none
    Skip empty lines=true
    Unescaped quote handling=null
Format configuration:
    CsvFormat:
        Comment character=#
        Field delimiter=\t
        Line separator (normalized)=\n
        Line separator sequence=\n
        Quote character=\0
        Quote escape character=\0
        Quote escape escape character=null
Internal state when error was thrown: line=39, column=0, record=35, charIndex=1182
    at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:195)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:544)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at com.univocity.parsers.common.ParserOutput.rowParsed(ParserOutput.java:166)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:189)
    ... 30 more
Exception in thread "main" org.apache.spark.SparkException: Job aborted.
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:215)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:65)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.write(FileFormatWriter.scala:173)
    at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:145)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.datasources.DataSource.writeInFileFormat(DataSource.scala:438)
    at org.apache.spark.sql.execution.datasources.DataSource.write(DataSource.scala:474)
    at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.run(SaveIntoDataSourceCommand.scala:48)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult$lzycompute(commands.scala:58)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.sideEffectResult(commands.scala:56)
    at org.apache.spark.sql.execution.command.ExecutedCommandExec.doExecute(commands.scala:74)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$execute$1.apply(SparkPlan.scala:117)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$executeQuery$1.apply(SparkPlan.scala:138)
    at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:151)
    at org.apache.spark.sql.execution.SparkPlan.executeQuery(SparkPlan.scala:135)
    at org.apache.spark.sql.execution.SparkPlan.execute(SparkPlan.scala:116)
    at org.apache.spark.sql.execution.QueryExecution.toRdd$lzycompute(QueryExecution.scala:92)
    at org.apache.spark.sql.execution.QueryExecution.toRdd(QueryExecution.scala:92)
    at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:610)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:233)
    at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:217)
    at org.apache.spark.sql.DataFrameWriter.text(DataFrameWriter.scala:555)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$.main(CoNLL2DepTermContext.scala:29)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext.main(CoNLL2DepTermContext.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:755)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:180)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:205)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:119)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 2 in stage 0.0 failed 1 times, most recent failure: Lost task 2.0 in stage 0.0 (TID 2, localhost, executor driver): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: com.univocity.parsers.common.TextParsingException: java.lang.ArrayIndexOutOfBoundsException - null
Parser Configuration: CsvParserSettings:
    Auto configuration enabled=true
    Autodetect column delimiter=false
    Autodetect quotes=false
    Column reordering enabled=true
    Empty value=null
    Escape unquoted values=false
    Header extraction enabled=null
    Headers=null
    Ignore leading whitespaces=true
    Ignore trailing whitespaces=true
    Input buffer size=1048576
    Input reading on separate thread=false
    Keep escape sequences=false
    Keep quotes=false
    Length of content displayed on error=-1
    Line separator detection enabled=false
    Maximum number of characters per column=4096
    Maximum number of columns=512
    Normalize escaped line separators=true
    Null value=
    Number of records to read=all
    Processor=none
    Restricting data in exceptions=false
    RowProcessor error handler=null
    Selected fields=none
    Skip empty lines=true
    Unescaped quote handling=nullFormat configuration:
    CsvFormat:
        Comment character=#
        Field delimiter=\t
        Line separator (normalized)=\n
        Line separator sequence=\n
        Quote character=\0
        Quote escape character=\0
        Quote escape escape character=null
Internal state when error was thrown: line=39, column=0, record=35, charIndex=1182
    at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:195)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:544)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at com.univocity.parsers.common.ParserOutput.rowParsed(ParserOutput.java:166)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:189)
    ... 30 more

Driver stacktrace:
    at org.apache.spark.scheduler.DAGScheduler.org$apache$spark$scheduler$DAGScheduler$$failJobAndIndependentStages(DAGScheduler.scala:1499)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1487)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$abortStage$1.apply(DAGScheduler.scala:1486)
    at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
    at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.abortStage(DAGScheduler.scala:1486)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGScheduler$$anonfun$handleTaskSetFailed$1.apply(DAGScheduler.scala:814)
    at scala.Option.foreach(Option.scala:257)
    at org.apache.spark.scheduler.DAGScheduler.handleTaskSetFailed(DAGScheduler.scala:814)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.doOnReceive(DAGScheduler.scala:1714)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1669)
    at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1658)
    at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
    at org.apache.spark.scheduler.DAGScheduler.runJob(DAGScheduler.scala:630)
    at org.apache.spark.SparkContext.runJob(SparkContext.scala:2022)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1.apply$mcV$sp(FileFormatWriter.scala:188)
    ... 45 more
Caused by: org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:272)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:191)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$write$1$$anonfun$apply$mcV$sp$1.apply(FileFormatWriter.scala:190)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:108)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:335)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.spark.SparkException: Failed to execute user defined function(anonfun$1: (string) => array<string>)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at scala.collection.Iterator$$anon$12.hasNext(Iterator.scala:439)
    at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:395)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$SingleDirectoryWriteTask.execute(FileFormatWriter.scala:315)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:258)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$$anonfun$org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask$3.apply(FileFormatWriter.scala:256)
    at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1375)
    at org.apache.spark.sql.execution.datasources.FileFormatWriter$.org$apache$spark$sql$execution$datasources$FileFormatWriter$$executeTask(FileFormatWriter.scala:261)
    ... 8 more
Caused by: com.univocity.parsers.common.TextParsingException: java.lang.ArrayIndexOutOfBoundsException - null
Parser Configuration: CsvParserSettings:
    Auto configuration enabled=true
    Autodetect column delimiter=false
    Autodetect quotes=false
    Column reordering enabled=true
    Empty value=null
    Escape unquoted values=false
    Header extraction enabled=null
    Headers=null
    Ignore leading whitespaces=true
    Ignore trailing whitespaces=true
    Input buffer size=1048576
    Input reading on separate thread=false
    Keep escape sequences=false
    Keep quotes=false
    Length of content displayed on error=-1
    Line separator detection enabled=false
    Maximum number of characters per column=4096
    Maximum number of columns=512
    Normalize escaped line separators=true
    Null value=
    Number of records to read=all
    Processor=none
    Restricting data in exceptions=false
    RowProcessor error handler=null
    Selected fields=none
    Skip empty lines=true
    Unescaped quote handling=nullFormat configuration:
    CsvFormat:
        Comment character=#
        Field delimiter=\t
        Line separator (normalized)=\n
        Line separator sequence=\n
        Quote character=\0
        Quote escape character=\0
        Quote escape escape character=null
Internal state when error was thrown: line=39, column=0, record=35, charIndex=1182
    at com.univocity.parsers.common.AbstractParser.handleException(AbstractParser.java:339)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:195)
    at com.univocity.parsers.common.AbstractParser.parseLine(AbstractParser.java:544)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at de.uhh.lt.conll.CoNLLParser$$anonfun$3.apply(CoNLLParser.scala:42)
    at scala.collection.immutable.List.map(List.scala:277)
    at de.uhh.lt.conll.CoNLLParser$.parseSingleSentence(CoNLLParser.scala:42)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    at de.uhh.lt.jst.dt.CoNLL2DepTermContext$$anonfun$1.apply(CoNLL2DepTermContext.scala:40)
    ... 23 more
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at com.univocity.parsers.common.ParserOutput.rowParsed(ParserOutput.java:166)
    at com.univocity.parsers.common.AbstractParser.handleEOF(AbstractParser.java:189)
    ... 30 more
17/08/21 17:22:00 INFO SparkContext: Invoking stop() from shutdown hook
17/08/21 17:22:00 INFO SparkUI: Stopped Spark web UI at http://134.100.15.237:4040
17/08/21 17:22:00 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
17/08/21 17:22:00 INFO MemoryStore: MemoryStore cleared
17/08/21 17:22:00 INFO BlockManager: BlockManager stopped
17/08/21 17:22:00 INFO BlockManagerMaster: BlockManagerMaster stopped
17/08/21 17:22:00 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
17/08/21 17:22:00 INFO SparkContext: Successfully stopped SparkContext
17/08/21 17:22:00 INFO ShutdownHookManager: Shutdown hook called
17/08/21 17:22:00 INFO ShutdownHookManager: Deleting directory /private/var/folders/tf/cy2lzyld3rz6mg8tqxm7zstr0000gn/T/spark-af65d987-ad1c-44a2-b25b-b399833e3f9a
panchenko@Alexanders-MacBook-Pro:~/Desktop/JoSimText/scripts$
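For context on what the parser is tripping over: the `ArrayIndexOutOfBoundsException` is raised from univocity's `handleEOF` while `CoNLLParser.parseSingleSentence` maps `parseLine` over the lines of one sentence block, i.e. a malformed or truncated row near end-of-input kills the whole Spark job. The sketch below is **not** the project's `CoNLLParser` (that lives in `de.uhh.lt.conll` and uses univocity); it is a minimal, hypothetical Python illustration of the defensive behaviour one might want instead: split sentences on blank lines, tolerate a missing trailing newline, and skip rows with an unexpected column count rather than raising.

```python
# Hypothetical sketch only -- illustrates blank-line sentence splitting and
# tolerant row parsing for CoNLL-style TSV input; column count is an assumption.

def split_sentences(text):
    """Split CoNLL-style text into sentences separated by blank lines."""
    sentences, current = [], []
    for line in text.split("\n"):
        if line.strip() == "":
            if current:
                sentences.append(current)
                current = []
        else:
            current.append(line)
    if current:  # tolerate a missing trailing newline (EOF mid-sentence)
        sentences.append(current)
    return sentences

def parse_rows(sentence, expected_cols=10):
    """Parse tab-separated rows, skipping comments and malformed rows."""
    rows = []
    for line in sentence:
        if line.startswith("#"):  # comment lines carry sentence metadata
            continue
        cols = line.split("\t")
        if len(cols) == expected_cols:
            rows.append(cols)
        # else: silently skip the malformed row instead of aborting the job
    return rows
```

In the actual Scala code, the analogous change would be to validate the column count of each `parseLine` result (or wrap the call in a `Try`) inside `parseSingleSentence`, so one bad record in the CC16 corpus cannot abort the Spark stage.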