RedisLabs / spark-redis

A connector for Spark that allows reading and writing to/from Redis cluster
BSD 3-Clause "New" or "Revised" License

Strange issue when trying to connect to RedisLabs managed cluster #18

Closed ScottWang closed 8 years ago

ScottWang commented 8 years ago

I encountered a strange issue. When I was developing and testing against a local Redis server everything worked, but it failed when I tried to connect to a cluster managed by RedisLabs.

Code snippets:

val sc = new SparkContext(new SparkConf()
      .setMaster("local")
      .setAppName("PreProcess")
      .set("redis.host", "10.10.10.10")
      .set("redis.port", "12345")
      .set("redis.auth", "xxxxxxxx"))

valueTuple.foreach(each => sc.toRedisHASH(sc.parallelize((each._2).toList), "hfollow"+each._1))

I encountered the following error:

============================ error log ==============================
Exception in thread "main" java.lang.NegativeArraySizeException
    at redis.clients.jedis.Protocol.processBulkReply(Protocol.java:159)
    at redis.clients.jedis.Protocol.process(Protocol.java:136)
    at redis.clients.jedis.Protocol.read(Protocol.java:196)
    at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:288)
    at redis.clients.jedis.Connection.getBinaryBulkReply(Connection.java:207)
    at redis.clients.jedis.Connection.getBulkReply(Connection.java:196)
    at redis.clients.jedis.BinaryJedis.info(BinaryJedis.java:2671)
    at com.redislabs.provider.redis.RedisConfig.clusterEnabled(RedisConfig.scala:146)
    at com.redislabs.provider.redis.RedisConfig.getNodes(RedisConfig.scala:252)
    at com.redislabs.provider.redis.RedisConfig.getHosts(RedisConfig.scala:168)
    at com.redislabs.provider.redis.RedisConfig.(RedisConfig.scala:96)
    at com.redislabs.provider.redis.RedisContext.toRedisHASH$default$4(redisFunctions.scala:58)
    at PreProcess$$anonfun$main$6.apply(preprocess.scala:115)
    at PreProcess$$anonfun$main$6.apply(preprocess.scala:115)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at PreProcess$.main(preprocess.scala:115)
    at PreProcess.main(preprocess.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:497)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:731)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:181)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:206)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:121)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)

dvirsky commented 8 years ago

@sunheehnus can you please have a look at this ASAP? Let me know if you need help setting up an RL cluster.

sunheehnus commented 8 years ago

Hello @ScottWang, thanks for reporting! The issue is strange. Judging by SparkConf.set('redis.auth', 'xxxxx'), you are using a standalone RL instance rather than cluster mode, right? According to the stack trace, the server's reply to info cluster is $x\r\n, where x is a negative number other than -1. Can you reproduce this issue consistently? Could you also run info cluster in redis-cli and see what it returns? Thanks very much.
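To make the point about the $x\r\n reply concrete, here is a minimal sketch of reading a Redis bulk reply header. It is an illustration only, not the Jedis source: the object and method names are hypothetical, and it treats characters as bytes for simplicity. A reader that allocates a buffer of the declared length without validating it would plausibly fail with a NegativeArraySizeException whenever the length is negative and not -1, which matches the diagnosis in the comment above.

```scala
import scala.util.Try

// Hypothetical, simplified bulk-reply reader (not the actual Jedis code).
object BulkReplySketch {
  // Parses a reply of the form "$<len>\r\n<payload>\r\n".
  // Right(None) is the null bulk reply "$-1\r\n"; Left carries a parse error.
  def readBulk(reply: String): Either[String, Option[String]] = {
    val headerEnd = reply.indexOf("\r\n")
    if (!reply.startsWith("$") || headerEnd < 0) {
      Left("not a well-formed bulk reply header")
    } else {
      Try(reply.substring(1, headerEnd).toInt).toOption match {
        case None => Left("length is not an integer")
        case Some(-1) => Right(None) // null bulk reply
        case Some(len) if len < 0 =>
          // A naive reader doing `new Array[Byte](len)` here would throw
          // java.lang.NegativeArraySizeException.
          Left(s"invalid negative bulk length: $len")
        case Some(len) =>
          val start = headerEnd + 2 // skip the CRLF after the header
          if (reply.length < start + len) Left("payload shorter than declared length")
          else Right(Some(reply.substring(start, start + len)))
      }
    }
  }
}
```

Rejecting the bad length explicitly turns an opaque allocation failure into an error message that names the malformed reply.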

sunheehnus commented 8 years ago

CC @ScottWang @dvirsky PS: I ran the following code against the RL service and it works for me, which is why I am asking whether you can reproduce this...

def main(args: Array[String]) {
    val sc = new SparkContext(new SparkConf()
      .setMaster("local")
      .setAppName("PreProcess")
      .set("redis.host", "pub-redis-17519.us-east-1-4.2.ec2.garantiadata.com")
      .set("redis.port", "17519")
    )
    val valueTuple = List(("Hello", Array(("1", "2"), ("2", "3"))), ("World", Array(("1", "2"), ("2", "3"))))
    valueTuple.foreach(each => sc.toRedisHASH(sc.parallelize((each._2).toList), "h_follow"+each._1))
}

If this code doesn't reflect the scenario you encountered, or if I have misunderstood your issue, please let me know. Thanks very much :-)

ScottWang commented 8 years ago

@sunheehnus, your code snippets look right.

It's really strange that the error only happens when I try to use the RL-managed Redis instance. Just to be clear, we sought out RL to help us provision and manage our internal Redis instance. There is no error when using my locally built Redis server.

I went back, pulled the latest from GitHub (c043db8), and ran my tests again. This time it gave me a different error, which I could reproduce consistently.

Thank you so much for looking into this.

======= error log ===========================
Exception in thread "main" redis.clients.jedis.exceptions.JedisConnectionException: Could not get a resource from the pool
    at redis.clients.util.Pool.getResource(Pool.java:50)
    at redis.clients.jedis.JedisPool.getResource(JedisPool.java:86)
    at com.redislabs.provider.redis.RedisEndpoint.connect(RedisConfig.scala:73)
    at com.redislabs.provider.redis.RedisConfig.clusterEnabled(RedisConfig.scala:145)
    at com.redislabs.provider.redis.RedisConfig.getNodes(RedisConfig.scala:252)
    at com.redislabs.provider.redis.RedisConfig.getHosts(RedisConfig.scala:168)
    at com.redislabs.provider.redis.RedisConfig.(RedisConfig.scala:96)
    at com.redislabs.provider.redis.RedisContext.toRedisHASH$default$4(redisFunctions.scala:58)
    at PreProcess$$anonfun$main$6.apply(preprocess.scala:123)
    at PreProcess$$anonfun$main$6.apply(preprocess.scala:123)
    at scala.collection.IndexedSeqOptimized$class.foreach(IndexedSeqOptimized.scala:33)
    at scala.collection.mutable.ArrayOps$ofRef.foreach(ArrayOps.scala:108)
    at PreProcess$.main(preprocess.scala:123)
    at PreProcess.main(preprocess.scala)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:606)
    at org.apache.spark.deploy.SparkSubmit$.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:664)
    at org.apache.spark.deploy.SparkSubmit$.doRunMain$1(SparkSubmit.scala:169)
    at org.apache.spark.deploy.SparkSubmit$.submit(SparkSubmit.scala:192)
    at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:111)
    at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR Client sent AUTH, but no password is set
    at redis.clients.jedis.Protocol.processError(Protocol.java:117)
    at redis.clients.jedis.Protocol.process(Protocol.java:142)
    at redis.clients.jedis.Protocol.read(Protocol.java:196)
    at redis.clients.jedis.Connection.readProtocolWithCheckingBroken(Connection.java:288)
    at redis.clients.jedis.Connection.getStatusCodeReply(Connection.java:187)
    at redis.clients.jedis.BinaryJedis.auth(BinaryJedis.java:2001)
    at redis.clients.jedis.JedisFactory.makeObject(JedisFactory.java:87)
    at org.apache.commons.pool2.impl.GenericObjectPool.create(GenericObjectPool.java:819)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:429)
    at org.apache.commons.pool2.impl.GenericObjectPool.borrowObject(GenericObjectPool.java:360)
    at redis.clients.util.Pool.getResource(Pool.java:48)
    ... 22 more

=============================== our redis server info ============================

info cluster

# Server
redis_version:2.8.21
redis_git_sha1:00000000
redis_git_dirty:0
redis_build_id:0000000000000000000000000000000000000000
redis_mode:standalone
os:Linux 3.13.0-36-generic x86_64
arch_bits:64
multiplexing_api:epoll
gcc_version:4.8.2
process_id:4
run_id:13443d7639782533bab44798ca69968a9dfca62d
tcp_port:19066
uptime_in_seconds:4914523
uptime_in_days:56
hz:10
lru_clock:0

# Clients
connected_clients:1
client_longest_output_list:0
client_biggest_input_buf:0
blocked_clients:0

# Memory
used_memory:11200056
used_memory_human:10.68M
used_memory_rss:11200056
used_memory_peak:14508960
used_memory_peak_human:13.83M
used_memory_lua:36864
mem_fragmentation_ratio:1
mem_allocator:jemalloc-3.6.0

# Persistence
loading:0
rdb_changes_since_last_save:706
rdb_bgsave_in_progress:0
rdb_last_save_time:1452807404
rdb_last_bgsave_status:ok
rdb_last_bgsave_time_sec:0
rdb_current_bgsave_time_sec:-1
aof_enabled:1
aof_rewrite_in_progress:0
aof_rewrite_scheduled:0
aof_last_rewrite_time_sec:-1
aof_current_rewrite_time_sec:-1
aof_last_bgrewrite_status:ok
aof_last_write_status:ok
aof_current_size:0
aof_base_size:0
aof_pending_rewrite:0
aof_buffer_length:0
aof_rewrite_buffer_length:0
aof_pending_bio_fsync:0
aof_delayed_fsync:0

# Stats
total_connections_received:40
total_commands_processed:1150
instantaneous_ops_per_sec:0
rejected_connections:0
sync_full:0
sync_partial_ok:0
sync_partial_err:0
expired_keys:0
evicted_keys:0
keyspace_hits:642
keyspace_misses:36
pubsub_channels:0
pubsub_patterns:0
latest_fork_usec:0

# Replication
role:master
connected_slaves:0
master_repl_offset:0
repl_backlog_active:0
repl_backlog_size:1048576
repl_backlog_first_byte_offset:0
repl_backlog_histlen:0

# CPU
used_cpu_sys:0.00
used_cpu_user:0.00
used_cpu_sys_children:0.00
used_cpu_user_children:0.00

# Keyspace
db0:keys=10,expires=0,avg_ttl=0

redis-xxxxx.rlec-web-dev.xxxxx.xxxxx.xx:xxxxx>

sunheehnus commented 8 years ago

Hello @ScottWang, thanks very much for reporting. :-) My last test missed the problem because I had not enabled 'auth' on the former RL instance. I have now requested a new instance with auth; the test code is:

def main(args: Array[String]) {
    val sc = new SparkContext(new SparkConf()
      .setMaster("local")
      .setAppName("PreProcess")
      .set("redis.host", "pub-redis-16119.us-east-1-4.4.ec2.garantiadata.com")
      .set("redis.port", "16119")
      .set("redis.auth", "xxxxx")
    )
    val valueTuple = List(("Hello", Array(("1", "2"), ("2", "3"))), ("World", Array(("1", "2"), ("2", "3"))))
    valueTuple.foreach(each => sc.toRedisHASH(sc.parallelize((each._2).toList), "h_follow"+each._1))
}

As @Yiftach said,

RL doesn't support the OSS 'info cluster' because, to the client, an RL Cluster looks like a regular single-shard Redis; behind the scenes (i.e., in the proxy) we shard Redis.

I have added a new commit to Spark-Redis; could you please try it? Thanks again for reporting. :-)
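One defensive way to decide between standalone and cluster mode when a server may not honor 'info cluster' is to parse whatever INFO text comes back and treat a missing or unexpected answer as standalone. The sketch below is an assumption about how such a check could look, not the contents of the commit mentioned above; the object and method names are hypothetical. It relies only on the standard INFO layout of "key:value" lines with "# Section" headers and on the cluster_enabled and redis_mode fields.

```scala
// Hypothetical helper: decides cluster mode from INFO text, defaulting to standalone.
object ClusterModeSketch {
  // Turns INFO text ("key:value" per line, "# Section" headers) into a map.
  def parseInfo(info: String): Map[String, String] =
    info.split("\r?\n").iterator
      .map(_.trim)
      .filter(line => line.nonEmpty && !line.startsWith("#"))
      .flatMap { line =>
        line.split(":", 2) match {
          case Array(key, value) => Some(key -> value)
          case _ => None // ignore lines that are not key:value pairs
        }
      }
      .toMap

  // True only when the server explicitly reports cluster mode.
  def clusterEnabled(info: String): Boolean = {
    val fields = parseInfo(info)
    fields.get("cluster_enabled").contains("1") ||
      fields.get("redis_mode").contains("cluster")
  }
}
```

With the server info posted earlier in this thread (redis_mode:standalone and no cluster_enabled field), this check would return false, so a connector using it would treat the RL endpoint as a single standalone node.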

ScottWang commented 8 years ago

Hi @sunheehnus,

I just pulled the latest (d5bc746) and ran a test. I am still getting the same error as above (3/11/2016).

dvirsky commented 8 years ago

@ScottWang can you paste here the exact code you are trying to run?

dvirsky commented 8 years ago

@ScottWang your error shows Caused by: redis.clients.jedis.exceptions.JedisDataException: ERR Client sent AUTH, but no password is set. Looks like your RLEC cluster doesn't have a password set, but your spark config does?
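A mismatch like this can be avoided on the client side by adding redis.auth only when a password is actually configured. The following is a minimal sketch under the assumption that the connector sends AUTH only when redis.auth is present (the earlier test without redis.auth in this thread is consistent with that); the object and function names are hypothetical, while the three setting keys are the ones used in the snippets above.

```scala
// Hypothetical helper: builds the Redis settings, omitting redis.auth when no password is given.
object RedisSettingsSketch {
  def redisSettings(host: String, port: Int, password: Option[String]): Seq[(String, String)] = {
    val base = Seq("redis.host" -> host, "redis.port" -> port.toString)
    // Blank passwords count as "no password" so AUTH is never sent by accident.
    val auth = password.map(_.trim).filter(_.nonEmpty).map(p => "redis.auth" -> p)
    base ++ auth.toSeq
  }
}
```

The resulting pairs could then be applied to a SparkConf, for example with setAll, before the SparkContext is created.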

ScottWang commented 8 years ago

I used redis-cli to access the server via: redis-cli -h xxxxxxx.xxxx.xx -p 19066 -a xxxxxx and was able to get to the server.

After that I went back to the server configuration and reset the password, and now everything is working. I'm really not sure what happened.

Regardless, thank you guys for the new build. Everything is working now, so on to load testing next.

dvirsky commented 8 years ago

Cool! I guess we can close this issue, right?

ScottWang commented 8 years ago

Yes, please. Again, thank you so much.

On Tue, Mar 15, 2016 at 11:23 AM, Dvir Volk notifications@github.com wrote:

Cool! I guess we can close this issue, right?


dvirsky commented 8 years ago

10x5ns