Closed Akhilj786 closed 3 years ago
Compiled the jar (branch : https://github.com/RedisLabs/spark-redis/tree/build_scala_2.12 which support 3.x and 2.12 scala version) and uploaded to the cluster Still no luck.
Prior issues: https://github.com/RedisLabs/spark-redis/issues/237#issuecomment-668267729
Hi @Akhilj786 ,
how do you configure parameters like spark.redis.host
and spark.redis.port
?
Could you print them in your databricks notebook to check they are configured?
@fe2s I have set them in the cluster creation step and also tried with running in spark session
Seeing the same issue with a Java app on Databricks trying to connect to Azure Redis Cache instance. Any suggestions on a possible workaround?
1/06/02 19:30:05 ERROR GnssMonitor: Caught an exception while analyzing a batch in the GNSS Monitor job
redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
at redis.clients.jedis.util.Pool.getResource(Pool.java:59)
at redis.clients.jedis.JedisPool.getResource(JedisPool.java:234)
at com.redislabs.provider.redis.ConnectionPool$.connect(ConnectionPool.scala:35)
at com.redislabs.provider.redis.RedisEndpoint.connect(RedisConfig.scala:72)
at com.redislabs.provider.redis.RedisConfig.clusterEnabled(RedisConfig.scala:196)
at com.redislabs.provider.redis.RedisConfig.getNodes(RedisConfig.scala:320)
at com.redislabs.provider.redis.RedisConfig.getHosts(RedisConfig.scala:236)
at com.redislabs.provider.redis.RedisConfig.<init>(RedisConfig.scala:135)
at org.apache.spark.sql.redis.RedisSourceRelation.<init>(RedisSourceRelation.scala:34)
at org.apache.spark.sql.redis.DefaultSource.createRelation(DefaultSource.scala:13)
at org.apache.spark.sql.execution.datasources.DataSource.resolveRelation(DataSource.scala:390)
at org.apache.spark.sql.DataFrameReader.loadV1Source(DataFrameReader.scala:424)
at org.apache.spark.sql.DataFrameReader.$anonfun$load$3(DataFrameReader.scala:391)
at scala.Option.getOrElse(Option.scala:189)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:391)
at org.apache.spark.sql.DataFrameReader.load(DataFrameReader.scala:264)
at com.chesapeaketechnology.datasci.gnssmonitor.GnssMonitor.analyzeBatch(GnssMonitor.java:181)
at org.apache.spark.sql.streaming.DataStreamWriter.$anonfun$foreachBatch$1(DataStreamWriter.scala:613)
at org.apache.spark.sql.streaming.DataStreamWriter.$anonfun$foreachBatch$1$adapted(DataStreamWriter.scala:613)
at org.apache.spark.sql.execution.streaming.sources.ForeachBatchSink.addBatch(ForeachBatchSink.scala:38)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$16(MicroBatchExecution.scala:606)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$5(SQLExecution.scala:126)
at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:267)
at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withCustomExecutionEnv$1(SQLExecution.scala:104)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:852)
at org.apache.spark.sql.execution.SQLExecution$.withCustomExecutionEnv(SQLExecution.scala:77)
at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:217)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runBatch$15(MicroBatchExecution.scala:604)
at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:293)
at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:291)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:73)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runBatch(MicroBatchExecution.scala:604)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$4(MicroBatchExecution.scala:243)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.withSchemaEvolution(MicroBatchExecution.scala:647)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$2(MicroBatchExecution.scala:240)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken(ProgressReporter.scala:293)
at org.apache.spark.sql.execution.streaming.ProgressReporter.reportTimeTaken$(ProgressReporter.scala:291)
at org.apache.spark.sql.execution.streaming.StreamExecution.reportTimeTaken(StreamExecution.scala:73)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.$anonfun$runActivatedStream$1(MicroBatchExecution.scala:209)
at org.apache.spark.sql.execution.streaming.ProcessingTimeExecutor.execute(TriggerExecutor.scala:57)
at org.apache.spark.sql.execution.streaming.MicroBatchExecution.runActivatedStream(MicroBatchExecution.scala:203)
at org.apache.spark.sql.execution.streaming.StreamExecution.$anonfun$runStream$1(StreamExecution.scala:366)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:852)
at org.apache.spark.sql.execution.streaming.StreamExecution.org$apache$spark$sql$execution$streaming$StreamExecution$$runStream(StreamExecution.scala:341)
at org.apache.spark.sql.execution.streaming.StreamExecution$$anon$1.run(StreamExecution.scala:268)
Caused by: redis.clients.jedis.exceptions.JedisConnectionException: java.net.SocketTimeoutException: Read timed out
at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:205)
at redis.clients.jedis.util.RedisInputStream.readByte(RedisInputStream.java:43)
at redis.clients.jedis.Protocol.process(Protocol.java:155)
at redis.clients.jedis.Protocol.read(Protocol.java:220)
at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:318)
at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:236)
at redis.clients.jedis.BinaryJedis.auth(BinaryJedis.java:2259)
at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:119)
at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:889)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:424)
at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:349)
at redis.clients.jedis.util.Pool.getResource(Pool.java:50)
... 46 more
Caused by: java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at java.net.SocketInputStream.read(SocketInputStream.java:127)
at redis.clients.jedis.util.RedisInputStream.ensureFill(RedisInputStream.java:199)
... 57 more
21/06/02 19:30:35 WARN AzureFileSystemThreadPoolExecutor: Disabling threads for Delete operation as thread count 0 is <= 1
Actually, in my case, the solution was to specify spark.redis.ssl true
in the Spark Config.
@fe2s @dtufekcic for now we circumvent this problem with vnet peering (Databricks) <--> Azure Cache Redis (vnet with premium subscription). We utilized 3679 port(non-ssl) as opposed to 3680(ssl port).
We are leveraging spark on databricks(azure). Spark Version:
3.1.0
I have compiled and upload the jar to one of the databricks cluster using the branch: https://github.com/RedisLabs/spark-redis/tree/build_scala_2.12
Configuration:
Error:
--- Sample implementation ---