RedisLabs / spark-redis

A connector for Spark that allows reading and writing to/from Redis cluster
BSD 3-Clause "New" or "Revised" License

NullPointerException save to redis cluster #376

Open yuanlin-work opened 1 year ago

yuanlin-work commented 1 year ago

I got this error when saving a DataFrame to a Redis cluster:

```
java.lang.NullPointerException
	at java.util.regex.Matcher.getTextLength(Matcher.java:1283)
	at java.util.regex.Matcher.reset(Matcher.java:309)
	at java.util.regex.Matcher.<init>(Matcher.java:229)
	at java.util.regex.Pattern.matcher(Pattern.java:1093)
	at scala.util.matching.Regex.findFirstIn(Regex.scala:388)
	at org.apache.spark.util.Utils$$anonfun$redact$1$$anonfun$apply$15.apply(Utils.scala:2643)
	at org.apache.spark.util.Utils$$anonfun$redact$1$$anonfun$apply$15.apply(Utils.scala:2643)
	at scala.Option.orElse(Option.scala:289)
	at org.apache.spark.util.Utils$$anonfun$redact$1.apply(Utils.scala:2643)
	at org.apache.spark.util.Utils$$anonfun$redact$1.apply(Utils.scala:2641)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.TraversableLike$$anonfun$map$1.apply(TraversableLike.scala:234)
	at scala.collection.mutable.ResizableArray$class.foreach(ResizableArray.scala:59)
	at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:48)
	at scala.collection.TraversableLike$class.map(TraversableLike.scala:234)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at org.apache.spark.util.Utils$.redact(Utils.scala:2641)
	at org.apache.spark.util.Utils$.redact(Utils.scala:2608)
	at org.apache.spark.sql.internal.SQLConf$$anonfun$redactOptions$1.apply(SQLConf.scala:2083)
	at org.apache.spark.sql.internal.SQLConf$$anonfun$redactOptions$1.apply(SQLConf.scala:2083)
	at scala.collection.LinearSeqOptimized$class.foldLeft(LinearSeqOptimized.scala:124)
	at scala.collection.immutable.List.foldLeft(List.scala:84)
	at org.apache.spark.sql.internal.SQLConf.redactOptions(SQLConf.scala:2083)
	at org.apache.spark.sql.execution.datasources.SaveIntoDataSourceCommand.simpleString(SaveIntoDataSourceCommand.scala:52)
	at org.apache.spark.sql.catalyst.plans.QueryPlan.verboseString(QueryPlan.scala:177)
	at org.apache.spark.sql.catalyst.trees.TreeNode.generateTreeString(TreeNode.scala:548)
	at org.apache.spark.sql.catalyst.trees.TreeNode.treeString(TreeNode.scala:472)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$4.apply(QueryExecution.scala:197)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$4.apply(QueryExecution.scala:197)
	at org.apache.spark.sql.execution.QueryExecution.stringOrError(QueryExecution.scala:99)
	at org.apache.spark.sql.execution.QueryExecution.toString(QueryExecution.scala:197)
	at org.apache.spark.sql.execution.SQLExecution$$anonfun$withNewExecutionId$1.apply(SQLExecution.scala:75)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:125)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:73)
	at org.apache.spark.sql.DataFrameWriter.runCommand(DataFrameWriter.scala:676)
	at org.apache.spark.sql.DataFrameWriter.saveToV1Source(DataFrameWriter.scala:285)
	at org.apache.spark.sql.DataFrameWriter.save(DataFrameWriter.scala:271)
	at org.yl.excutor.SparkApplication.start(SparkApplication.java:271)
```

My code looks like this:

```java
datasetResult.write()
    .format("org.apache.spark.sql.redis")
    .mode("append")
    .option("table", jobDetail.getTo_table())
    .option("host", jobDetail.getTo_host())
    .option("port", jobDetail.getTo_port())
    .option("auth", jobDetail.getTo_passwd())
    .option("ttl", jobDetail.getRedis_table_ttl())
    .option("key.column", "id")
    .save();
```
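Judging from the stack trace, the NPE happens inside Spark's `Utils.redact`, which runs a regex over every write option's value; a `null` option value (for example, if `jobDetail.getRedis_table_ttl()` returns `null` for the cluster job) would produce exactly this failure before the connector even runs. As a sketch under that assumption, a hypothetical helper like the one below can validate the option map up front and name the offending key (`requireNonNullOptions` and the sample keys are mine, not part of spark-redis):

```java
import java.util.HashMap;
import java.util.Map;

public class OptionCheck {
    // Hypothetical guard: fail fast with the option name instead of letting
    // Spark's option redaction hit a NullPointerException on a null value.
    public static Map<String, String> requireNonNullOptions(Map<String, String> options) {
        for (Map.Entry<String, String> e : options.entrySet()) {
            if (e.getValue() == null) {
                throw new IllegalArgumentException(
                        "Redis write option '" + e.getKey() + "' is null");
            }
        }
        return options;
    }

    public static void main(String[] args) {
        Map<String, String> opts = new HashMap<>();
        opts.put("table", "person");
        opts.put("ttl", null); // e.g. jobDetail.getRedis_table_ttl() returned null
        try {
            requireNonNullOptions(opts);
        } catch (IllegalArgumentException ex) {
            System.out.println(ex.getMessage()); // prints: Redis write option 'ttl' is null
        }
    }
}
```

The validated map could then be passed in one call via `DataFrameWriter.options(Map)` instead of individual `.option(...)` calls.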

And `datasetResult.show()` prints the data fine:

```
+---+---------+---+--------------+
| id|     name|age|       address|
+---+---------+---+--------------+
|  1| zhangsan| 22|fsdfsadfasdfsd|
|  2|hanmeimei| 23|   bbbbccccddd|
+---+---------+---+--------------+
```

But when `jobDetail.getTo_host()` points to a standalone Redis instance, it works. My dependency:

```xml
<dependency>
  <groupId>com.redislabs</groupId>
  <artifactId>spark-redis_2.11</artifactId>
  <version>2.4.2</version>
</dependency>
```

Anything that could help? Thanks.