Open chrisbenincasa opened 7 years ago
This definitely is a bug, but I think it's a bit more nuanced than handling the QUEUED response. Are you willing to do a bit of debugging? It would be really helpful to see what's going over the wire when this race condition hits.
Questions:
@nepthar definitely willing to debug. I was not able to reproduce the NoSuchMethodError (I think that was a bug on my end), but I can reproduce the AbstractMethodError consistently. I setup a quick test in the form I described above. Here are some details:
The test:
it should "fail" in {
val underlyingClient = client.redisClient
val keyVals = List.fill(10)(Buf.Utf8(randomString) -> Buf.Utf8(randomString))
val cmds = Seq(
MSet(keyVals.toMap)
) ++ keyVals.map { case (k, _) => PExpire(k, 100) }
underlyingClient.transaction(cmds)
val fut = underlyingClient.mGet(keyVals.map { case (k, _) => k }).map(res => {
res.foreach(println)
})
com.twitter.util.Await.result(fut, com.twitter.util.Duration.Top)
}
The breakpoint on QUEUED
getting hit:
The redis monitor output:
) redis-cli -p 30144 monitor
OK
1493925332.472761 [0 172.17.0.1:41410] "MULTI"
1493925332.495212 [0 172.17.0.1:41410] "MSET" "g8iEo" "0qHv1" "9WpFi" "ybjwz" "4ydI3" "8uRXO" "oO29P" "rJWKH" "Chget" "CzTXa" "19J6T" "JhJAl" "jln65" "wW5pg" "o3w9L" "kumUR" "OrOfM" "36eE2" "PLz74" "EyYIj"
1493925332.495270 [0 172.17.0.1:41410] "PEXPIRE" "o3w9L" "100"
1493925332.495296 [0 172.17.0.1:41410] "PEXPIRE" "jln65" "100"
1493925332.495318 [0 172.17.0.1:41410] "PEXPIRE" "g8iEo" "100"
1493925332.495340 [0 172.17.0.1:41410] "PEXPIRE" "4ydI3" "100"
1493925332.495362 [0 172.17.0.1:41410] "PEXPIRE" "Chget" "100"
1493925332.495384 [0 172.17.0.1:41410] "PEXPIRE" "PLz74" "100"
1493925332.495405 [0 172.17.0.1:41410] "PEXPIRE" "OrOfM" "100"
1493925332.495428 [0 172.17.0.1:41410] "PEXPIRE" "oO29P" "100"
1493925332.495450 [0 172.17.0.1:41410] "PEXPIRE" "19J6T" "100"
1493925332.495471 [0 172.17.0.1:41410] "PEXPIRE" "9WpFi" "100"
1493925332.495493 [0 172.17.0.1:41410] "MGET" "o3w9L" "jln65" "g8iEo" "4ydI3" "Chget" "PLz74" "OrOfM" "oO29P" "19J6T" "9WpFi"
1493925332.495557 [0 172.17.0.1:41410] "EXEC"
Test output:
[info] - should fail *** FAILED ***
[info] java.lang.ClassCastException: scala.runtime.BoxedUnit cannot be cast to scala.collection.Seq
[info] at com.curalate.redis.test.RedisTests$$anonfun$1$$anonfun$17.apply(RedisTests.scala:151)
[info] at com.twitter.util.Future$$anonfun$map$1$$anonfun$apply$3.apply(Future.scala:1145)
[info] at com.twitter.util.Try$.apply(Try.scala:15)
[info] at com.twitter.util.Future$.apply(Future.scala:163)
[info] at com.twitter.util.Future$$anonfun$map$1.apply(Future.scala:1145)
[info] at com.twitter.util.Future$$anonfun$map$1.apply(Future.scala:1144)
[info] at com.twitter.util.Promise$Transformer.liftedTree1$1(Promise.scala:107)
[info] at com.twitter.util.Promise$Transformer.k(Promise.scala:107)
[info] at com.twitter.util.Promise$Transformer.apply(Promise.scala:117)
[info] at com.twitter.util.Promise$Transformer.apply(Promise.scala:98)
The Redis monitoring output tells the tale, I think. You can see that the MGET executes within the MULTI block, thus returning the QUEUED response. My callback on the MGET function in the code then interprets that response as successful, but the underlying type is not a Seq.
Let me know what else I can do to help!
Nice detective work!
It does indeed look pretty clear cut. I'm not super familiar with redis, but is it also possible to view the server's response to each of those?
A fix for this is now on our roadmap internally, although I'm uncertain of when it will be finished. The immediate workaround is to allow the transaction to complete before issuing another command:
client.transaction(cmds).flatMap { _ => client.mGet(....) }.map { results => ... }
If you'd like/have time, we're always accepting pull requests! My starting point would be examining RedisPool and seeing what's going on with transactions there.
@chrisbenincasa all of our finagle clients are set up to handle one response per one request, and it seems that for these transaction requests, that one-one mapping no longer holds, hence the mismatched requests/reponse pairs, and the ClassCastException. There currently aren't any plans to fix this behavior, but please feel free to submit a PR!
@jcrossley I'm not sure I follow -- the single QUEUED response comes from a single MGET request. It's not that the response are mismatched (unless you mean that the eventual result for an MGET issues during a transaction ends up as a part of the response for an EXEC request), but rather that the single QUEUED response is cast opaquely to Unit. In other words, consumers have no way of knowing if the single response is actually the type they expect, if the "raw" response from Redis is a QUEUED type.
As a quick follow-up, I'm certainly interested in taking on the change here. Though, I'm curious as to what your opinions are on the correct solution. To some degree, one could argue this is user error for using Redis transactions in a multi-threaded application environment; so you could consider this a sort of error state (maybe a specialized Exception to let the caller handle the QUEUED case?). On the other hand, the response isn't technically an error, so an exception may not fit the bill.
When Redis returns a
QUEUED
response, the returned value is opaquely cast which causes runtime errors.Expected behavior
I'm not sure what the right call here is. On one hand, the call to Redis was successful, so failing the Future doesn't seem right. On the other, returning any instance of
T
also seems wrong and changing everything toOption[T]
would be an even larger change. One option is to wrap all results in some "Response" class to provide additional information about the result.Actual behavior
On
QUEUED
from Redis, the result is cast to aFuture[Nothing]
(presumably to get the types working) fromFuture.done
, which ends up being aConstFuture(Return(Unit))
. Callers see the Future as successful, however do not know the underlying value is a BoxedUnit, which leads to ClassCastExceptions when the value is passed to functions.Example error:
Steps to reproduce the behavior
Pseudo-code