memsql / singlestore-spark-connector

A connector for SingleStore and Spark
Apache License 2.0

[Help appreciated] Issue when writing to memsql #40

Closed: michema closed this issue 6 years ago

michema commented 6 years ago

Hi,

I am trying to use memsql-spark-connector to write a Spark DataFrame to MemSQL, but it fails with the error below. The DataFrame is generated with a group by. Can anyone help? Thanks

```
java.lang.NullPointerException
	at org.apache.spark.sql.execution.aggregate.HashAggregateExec.createHashMap(HashAggregateExec.scala:311)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.agg_doAggregateWithKeys$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIterator.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:43)
	at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$8$$anon$1.hasNext(WholeStageCodegenExec.scala:377)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at scala.collection.Iterator$$anon$11.hasNext(Iterator.scala:408)
	at scala.collection.Iterator$class.foreach(Iterator.scala:893)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1336)
	at com.memsql.spark.connector.LoadDataStrategy$$anon$2.run(LoadDataStrategy.scala:52)
	at java.lang.Thread.run(Thread.java:745)
```
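[Editor's note] The reporter's actual code is not shown in the issue. The following is a minimal sketch of the kind of write path being described, assuming the spark.memsql.* connection settings are already configured and that the connector is addressed through its data source name com.memsql.spark.connector; the table and column names are hypothetical.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.count

val spark = SparkSession.builder()
  .appName("memsql-write-sketch")
  .getOrCreate()

// Hypothetical source data; the issue only says the DataFrame comes from a group by.
val events = spark.read.parquet("/path/to/events")

// Aggregation producing the DataFrame that is written out.
val counts = events
  .groupBy("user_id")
  .agg(count("*").as("event_count"))

// Write through the connector; the save runs through
// com.memsql.spark.connector.LoadDataStrategy, which is where the
// NullPointerException in the trace above is raised.
counts.write
  .format("com.memsql.spark.connector")
  .save("example_db.user_event_counts")
```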

lucyyu commented 6 years ago

Do you have more information, such as the code where you set up the Spark configuration settings ("spark.memsql.host", etc.) and the code that creates the DataFrame with the group by query?
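[Editor's note] The configuration being asked about is normally supplied when building the SparkSession. A sketch with illustrative values follows; "spark.memsql.host" is the key named above, while the remaining spark.memsql.* keys and their defaults are assumed from the connector's documentation.

```scala
import org.apache.spark.sql.SparkSession

// Illustrative connection values only, not the reporter's actual settings.
val spark = SparkSession.builder()
  .appName("memsql-connector-setup")
  .config("spark.memsql.host", "10.0.0.10")
  .config("spark.memsql.port", "3306")
  .config("spark.memsql.user", "root")
  .config("spark.memsql.password", "secret")
  .config("spark.memsql.defaultDatabase", "example_db")
  .getOrCreate()
```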

michema commented 6 years ago

@lucyyu I was using 2.0.2, as suggested in the README. The latest version, 2.0.4, fixes this issue: https://github.com/memsql/memsql-spark-connector/commit/5ba2c86b6fb0c0a711677cfced37eed45352d97f
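[Editor's note] For anyone landing here with the same trace, the practical fix is to depend on 2.0.4 or later. A build.sbt sketch, assuming the Maven coordinates given in the project README:

```scala
// build.sbt: depend on a connector release that includes the fix
// (group/artifact assumed from the project README).
libraryDependencies += "com.memsql" %% "memsql-connector" % "2.0.4"
```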

lucyyu commented 6 years ago

Good to know, thanks for the update. We will update the README.