gomodule / redigo

Go client for Redis
Apache License 2.0
9.76k stars 1.25k forks source link

reply.go sliceHelper does not handle Redis errors within the slice #579

Closed mitchsw closed 2 years ago

mitchsw commented 3 years ago

I have the following line of code:

values, err := redis.ByteSlices(conn.Do("MGET", params...))

Today I investigated the following error:

redigo: unexpected element type for ByteSlices, got type redis.Error

This error was very surprising! It comes from here, which suggests that the conn.Do() reply was indeed a interface{}[], however one of the elements in the interface slice was a redis.Error.

This is technically possible based on RESP. You can have a RESP array where one of the elements is a RESP error. In redigo, this would hit this readReply for the array, and this readReply for the inner error. Such a response is poorly handled by sliceHelper.

This likely hasn't been seen before because a normal MGET etc would never reply with a slice where one element is an error. However, when using the Envoy Redis proxy with a partitioned cluster, this is possible if one of the endpoints is unavailable. I have quickly reproduced this with:

Redis partition 1/3:

127.0.0.1:6379> CLIENT PAUSE 60000 ALL
OK

Envoy Redis proxy:

127.0.0.1:6379> MGET a b c d e f g h i j k l m n o p
 1) (nil)
 2) (nil)
 3) (nil)
 4) (nil)
 5) (error) upstream failure
 6) (error) upstream failure
 7) (error) upstream failure
 8) (error) upstream failure
 9) (nil)
10) (nil)
11) (error) upstream failure
12) (error) upstream failure
13) (nil)
14) (nil)
15) (nil)
16) (nil)
(3.60s)

This reply will trigger the strange error message I saw above.

I think redigo could better handle this failure mode by propagating at least one of the array errors to the caller, e.g.:

redigo: slice contained an error: upstream failure
stevenh commented 3 years ago

Seems like the upstream failure was propagated, so could you clarify how you think this could be handled better?

mitchsw commented 3 years ago

Thanks for the reply @stevenh. I could have been clearer, sorry. The return value of

values, err := redis.ByteSlices(conn.Do("MGET", params...))

was

nil, Error("redigo: unexpected element type for ByteSlices, got type redis.Error")

So no, redigo didn't propagate the upstream failure, instead the library just told me there was an error without shedding light on what error it was. It also manifests as more of an internal "unexpected" error, when in fact the response was a valid reply.

It could be handled better by maybe just propagating the first error, e.g.

nil, Error("upstream failure")

or some kind of annotation, e.g.

nil, Error("redigo: slice contained an error: upstream failure")
stevenh commented 3 years ago

Thanks for clarifying @mitchsw!

stevenh commented 3 years ago

Can you try https://github.com/gomodule/redigo/pull/580 @mitchsw and see if that addresses the issue?

stevenh commented 3 years ago

Just checking in to see if you have managed to test #580 yet @mitchsw ?