TIBCOSoftware / snappydata

Project SnappyData - memory optimized analytics database, based on Apache Spark™ and Apache Geode™. Stream, Transact, Analyze, Predict in one cluster
http://www.snappydata.io

java.sql.BatchUpdateException #556

Open thbeh opened 7 years ago

thbeh commented 7 years ago

I was trying to write to a column table and got the following error. This is how I run the spark-shell:

/opt/mapr/spark/spark-2.0.1/bin/spark-shell --master yarn --conf spark.snappydata.store.locators=192.168.100.109:10334 --packages "SnappyDataInc:snappydata:0.8-s_2.11"

scala> dataFrame.write.insertInto("TestColumnTable")
[Stage 1:> (0 + 1) / 2]
17/04/07 20:00:40 WARN TaskSetManager: Lost task 0.0 in stage 1.0 (TID 1, node1): java.io.IOException: java.sql.BatchUpdateException: (Server=/192.168.100.109[1528] Thread=pool-3-thread-14) The exception 'com.gemstone.gemfire.internal.cache.PutAllPartialResultException: Key tableinfo(true).CompactCompositeRegionKey@7552e998=(112) and possibly others failed to put due to java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NullPointerException The putAll operation failed to put 500 out of 500 entries. ' was thrown while evaluating an expression.
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.columnar.impl.JDBCSourceAsColumnarStore$$anonfun$doInsert$1.apply(JDBCSourceAsColumnarStore.scala:125)
    at org.apache.spark.sql.execution.columnar.impl.JDBCSourceAsColumnarStore$$anonfun$doInsert$1.apply(JDBCSourceAsColumnarStore.scala:81)
    at org.apache.spark.sql.execution.columnar.ExternalStore$class.tryExecute(ExternalStore.scala:51)
    at org.apache.spark.sql.execution.columnar.JDBCSourceAsStore.tryExecute(JDBCSourceAsStore.scala:46)
    at org.apache.spark.sql.execution.columnar.JDBCSourceAsStore.storeColumnBatch(JDBCSourceAsStore.scala:74)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.columninsert_storeColumnBatch$(Unknown Source)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
    at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:370)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:246)
    at org.apache.spark.sql.execution.SparkPlan$$anonfun$4.apply(SparkPlan.scala:240)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:803)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsInternal$1$$anonfun$apply$24.apply(RDD.scala:803)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:319)
    at org.apache.spark.rdd.RDD.iterator(RDD.scala:283)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:70)
    at org.apache.spark.scheduler.Task.run(Task.scala:86)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:274)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
Caused by: java.sql.BatchUpdateException: (Server=/192.168.100.109[1528] Thread=pool-3-thread-14) The exception 'com.gemstone.gemfire.internal.cache.PutAllPartialResultException: Key tableinfo(true).CompactCompositeRegionKey@7552e998=(112) and possibly others failed to put due to java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.NullPointerException The putAll operation failed to put 500 out of 500 entries. ' was thrown while evaluating an expression.
    at io.snappydata.thrift.SnappyDataService$executePreparedBatch_result$executePreparedBatch_resultStandardScheme.read(SnappyDataService.java:15625)
    at io.snappydata.thrift.SnappyDataService$executePreparedBatch_result$executePreparedBatch_resultStandardScheme.read(SnappyDataService.java:15602)
    at io.snappydata.thrift.SnappyDataService$executePreparedBatch_result.read(SnappyDataService.java:15541)
    at io.snappydata.org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
    at io.snappydata.thrift.SnappyDataService$Client.recvexecutePreparedBatch(SnappyDataService.java:461)
    at io.snappydata.thrift.SnappyDataService$Client.executePreparedBatch(SnappyDataService.java:445)
    at io.snappydata.thrift.internal.ClientService.executePreparedBatch(ClientService.java:1290)
    at io.snappydata.thrift.internal.ClientPreparedStatement.executeBatch(ClientPreparedStatement.java:278)
    ... 24 more
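
For reference, a minimal sketch of the write path being exercised above, assuming SnappyData Smart Connector mode against the locator given on the spark-shell command line; the DataFrame construction here is a hypothetical stand-in, since the original one is not shown in the report:

import org.apache.spark.sql.SnappySession

// Hypothetical reproduction sketch: build a SnappySession over the shell's
// existing SparkContext, define a simple column table, and append rows to it.
val snSession = new SnappySession(sc)
snSession.sql("create table TestColumnTable (id bigint not null, k bigint not null) using column")

// A stand-in DataFrame whose schema matches the table definition above.
val dataFrame = snSession.range(1000).selectExpr("id", "id as k")
dataFrame.write.insertInto("TestColumnTable")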

jramnara commented 7 years ago

Hmm, this seems like an NPE bug. Can you print your schema (dataFrame.printSchema)? And, if possible, send us the entire log that includes this exception. It should be in your working_directory/server../snappyserver.log

thbeh commented 7 years ago

Hi Jags,

I am running SnappyData with multiple containers as per https://github.com/SnappyDataInc/snappy-cloud-tools/tree/master/docker#using-multiple-containers-with-docker-compose, so I am attaching the snappyserver.log from the server container. I hope that is the one you are referring to.

And this is the schema of the table from the error:

scala> snSession.sql("create table TestColumnTable (id bigint not null, k bigint not null) using column")
res1: org.apache.spark.sql.CachedDataFrame = []
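
As a quick sanity check tied to the printSchema request above, a small sketch (assuming the snSession and dataFrame from the earlier snippets are still in scope in the same spark-shell session) for comparing the DataFrame's schema with the target table's schema:

// Print both schemas and check that column names and types line up.
dataFrame.printSchema()
snSession.table("TestColumnTable").printSchema()

val schemasMatch = dataFrame.schema.map(f => (f.name.toLowerCase, f.dataType)) ==
  snSession.table("TestColumnTable").schema.map(f => (f.name.toLowerCase, f.dataType))
println(s"schemas match: $schemasMatch")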

snappyserver.log.zip

hbhanawat commented 7 years ago

For debugging, can you try your scenario with snappy-spark instead of MapR's Spark? Your server logs indicate that the executor (i.e. the snappy server) could not fetch some information from the Hive store.

thbeh commented 7 years ago

@hbhanawat snappy-spark works without any issue, but I have to run on MapR's Spark. The only difference is the Spark version: in SnappyData it is 2.0.2.3 and in MapR it is 2.0.1.

sumwale commented 7 years ago

@thbeh Can you attach the SnappyData server and lead logs, as well as the spark-shell logs?

sumwale commented 7 years ago

@thbeh Can you try with the latest release (1.0 RC; 1.0 will be out later this week)? It is compatible with Spark 2.1.1, so please test with a compatible MapR version.
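
A trivial check, assuming the same spark-shell session, to confirm which Spark version MapR's shell actually runs before retrying; per the comment above it should report 2.1.1 (or a compatible 2.1.x level) to line up with SnappyData 1.0, and the --packages coordinate would also need to point at the 1.0 artifact rather than 0.8-s_2.11:

// Print the Spark version the running shell reports; it should be 2.1.x
// before re-testing the insert against the SnappyData 1.0 release.
println(spark.version)
println(org.apache.spark.SPARK_VERSION)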