etaty / rediscala

Non-blocking, Reactive Redis driver for Scala (with Sentinel support)
Apache License 2.0
790 stars 142 forks source link

Read-Modify-Write transactions #26

Closed dlangdon closed 10 years ago

dlangdon commented 10 years ago

Sorry if this is obvious but I could not find any example in the transaction spec or other documentation that shows how to handle futures for transactions that need to read in order to decide what to do inside the transaction.

Let's say I want a transaction that read the value of A, checks for a condition on it and if it is fulfilled modifies a value for B. In plain redis I would use WATCH, GET A, MULTI, SET B and EXEC.

In rediscala watch is part of the transaction, and It does not return its own future. So I'm unsure how to proceed. I need to know when the watch has been received before doing GET and I need to keep watching and I can't execute the whole transaction since I'm still building it. I'm not sure if I should have two separate transactions, but the session semantics are not present, so I'm unsure if the watches will still be valid then.

So overall, how do you do this kind of thing?

Another example: http://stackoverflow.com/questions/10750626/transactions-and-watch-statement-in-redis

etaty commented 10 years ago

I understand your use case. In rediscala, you can't read inside a transaction, because you will be blocking the client for other requests. I suggest you try to look if you can do your check in a LUA script. (transform the transaction in a lua script)

lucaong commented 10 years ago

First of all thanks for the great rediscala :)

Was this closed because it is not implementable or not advisable to implement?

I might be missing something, but In Redis you can implement optimistic locking this way (pseudocode, variable assignment and arithmetic would not really work in the cli):

WATCH foo
current = GET foo
MULTI
SET foo + 1
EXEC

The problem is that in rediscala, I can see no way of sending the watch and making sure it is processed before proceeding. It looks to me, also as suggested by @dlangdon, that this could be accomplished by having watch and exec both return a future. It would enable optimistic locking of the sort:

for {
  watch    <- redis.watch("foo")
  original <- redis.get("foo")
  updated  =  original + "baz"
  result   <- redis.multi().set("foo", updated).exec()
} yield result

If the future returned by exec would fail because of race conditions, one could retry the whole thing. Am I missing something? This would be incredibly useful. In case this is feasible, I will try to get accustomed to rediscala source and see if I can help on this.

dlangdon commented 10 years ago

From what I understood from @etaty the problem is a mismatch on session semantics and blocking. (have not verified exactly what the actual code does so take this with a grain of salt). The issue is that redis has blocking sessions.

In the scala client, each request opens a connection, sends the call, and returns a future. When there is a reply from redis the future is completed, callbacks called and the connection freed. While this happens, any new call will work on a separate session and the client itself will not block. You could, I guess, run out of connections in the pool, which is a risk when your code is going ahead at full speed while previous calls might be slower to finish.

Transactions are apparently implemented as an aggregation of commands that are then executed one by one by the TransactionManager in a single session, after exec() is called, completing futures as it goes. So, while the transaction might LOOK asynchronous, it is blocking all the time. YOUR scala code knows nothing of this though, as it works on the futures as they are completed, not blocking.

If this interpretation is correct (and I will leave it to @etaty to validate hehe) then IMHO the issue is the way the TransactionBuilder works. The only thing needed to execute code like the one you mention is to be able to refer to the same session you had when you sent the watch command, and to be able to make multiple calls on it on each future, finally making a call that frees the entire thing.

This is in practice the ability to make redis calls synchronously one after the other, even from async code (like your example). The risk is in a single connection being blocked for a long period, especially if YOUR code takes a while to do its stuff.

IMHO, if the tradeoffs are well explained, I would like this alternate approach as it would provide better semantics than the actual blocking client.

etaty commented 10 years ago

It is closed because not advisable to implement You should use http://redis.io/commands#scripting You can even do your task in the example with one command : http://redis.io/commands/setrange

@dlangdon your are almost correct

Transaction cmd are sent in a batch (watch multi cmds exec). You need to call exec to sent the batch. A transaction in rediscala is only useful :

ezsi commented 9 years ago

I totally share the point of @lucaong, the WATCH message should be sent before entering into the transaction. In my view it protects a key from concurrent modification between the WATCH call and the execution of the transaction. WATCH takes a snapshot of the current state of a key and when the transaction updates the key then it can do a compare and fire exception if not. So the timing of the WATCH is important!

After looking into the code it turns out that you can manually send the WATCH to redis.

watch <- redis.send(redis.api.transactions.Watch(Set("foo")))

I'm not sure about the drawbacks of this workaround but it seems working.