spring-projects / spring-data-redis

Provides support to increase developer productivity in Java when using Redis, a key-value store. Uses familiar Spring concepts such as template classes for core API usage and lightweight repository-style data access.
https://spring.io/projects/spring-data-redis/
Apache License 2.0

CrudRepository.findByIdOrNull() sometimes returns null even when there is a value #2515

Closed dlwhdals987 closed 1 year ago

dlwhdals987 commented 1 year ago

val kotlinVer = "1.7.0"
id("org.springframework.boot") version "2.6.1"
id("io.spring.dependency-management") version "1.0.11.RELEASE"
@Repository
interface PlayerRepository : CrudRepository<Player, MemberId> {}

playerRepository.findByIdOrNull() --> When used as above, there are cases where the result is null even though the data exists.

If there is a delay due to the large number of requests, sometimes the result is null.

Internally, it seems to use the <T> RedisCommand<K, V, T> dispatch(RedisCommand<K, V, T> command) method in the io.lettuce.core.api.StatefulConnection interface. Is it related?

Is there a way to ensure the lookup returns a non-null value when the data exists?

jxblum commented 1 year ago

Hi @dlwhdals987-

DISCLAIMER: I am no Redis expert and I am still learning the ins and outs of SD Redis as well as Redis.

Regarding:

"If there is a delay due to the large number of requests, sometimes the result is null."

This is somewhat relevant to your ask (corrected & simplified):

Internally, it seems to use the <T> AsyncCommand<K, V, T> dispatch(:RedisCommand<K, V, T>) method in the io.lettuce.core.api.StatefulConnection interface. Is it related?

In short, YES.

However, let's be very clear here, and start at the beginning.

As you are probably already aware, Spring Data Redis builds on Spring Data KeyValue. Therefore, SD Redis's (default) CrudRepository implementation is actually SD KeyValue's SimpleKeyValueRepository class. Thus, when you (indirectly) invoke the findById(:ID) operation on CrudRepository, the following (approximate) sequence of calls will happen:

playerRepository.findById(memberId)
  -> SimpleKeyValueRepository.findById(memberId, Player.class):Optional<Player>
  -> SimpleKeyValueRepository.execute(:KeyValueAdapter):Optional<Player>
  -> RedisKeyValueAdapter.get(memberId, keyspace, Player.class) // a SD KeyValue KeyValueAdapter interface impl
  -> RedisTemplate.execute(:RedisConnection)
  -> DefaultedRedisConnection.hGetAll(key) // key is the memberId in binary form (byte[])
  -> LettuceHashCommands.hGetAll(key)
...

Now is the moment to pause and observe the implementation of the LettuceHashCommands.hGetAll(key) method. You will notice 2 things.

First, on line 103, notice connection.invoke().... LettuceConnection's invoke() method (here) returns an "asynchronous" connection.

Second, the use of the "asynchronous" connection also corresponds to the invocation of the RedisHashAsyncCommands::hgetall "asynchronous" operation (command), also on line 103. Inside RedisHashAsyncCommands.hgetall(..) is also where we find the dispatch(..) method call to which you are referring.

The connection.invoke() method returns a SD Redis LettuceInvoker instance, where the just(..) method will be called with the command to execute (an asynchronous invocation). As you can see, this makes use of a Synchronizer, here, to invoke (and handle) our (asynchronous) operation/command (i.e. RedisHashAsyncCommands.hgetall(..)).

The Synchronizer implementation was provided by the LettuceConnection back in the doInvoke(..) method, here. It is the 2nd argument to the LettuceInvoker constructor, with conditional construction based on whether we [opened a pipeline](https://docs.spring.io/spring-data/redis/docs/current/api/org/springframework/data/redis/connection/lettuce/LettuceConnection.html#openPipeline()), are [inside a transaction](https://docs.spring.io/spring-data/redis/docs/current/api/org/springframework/data/redis/connection/lettuce/LettuceConnection.html#multi()), or we are simply executing the command and wish to "wait" for the result, here, and specifically here, then here.

Let's assume you are using the default execution path (i.e. not in an open pipeline or transaction (referred to as isQueueing() or isMulti by the LettuceConnection)). Then the Lettuce driver provides a means to handle the RedisFuture using a timeout (here) that you configure on the connection, and in this case, the LettuceConnection.
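To make that failure mode concrete before getting to the solution, here is a minimal, runnable sketch in plain Java (no Lettuce types involved; awaitOrNull is a hypothetical helper standing in for the synchronized wait on the async result, and the timings are illustrative):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutDemo {

    // Hypothetical helper standing in for the Synchronizer's wait: return the
    // async result if it arrives within the timeout, otherwise null.
    static <T> T awaitOrNull(CompletableFuture<T> future, long timeoutMillis) {
        try {
            return future.get(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            return null; // timed out waiting; the caller observes null
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();

        // Simulate a reply from a loaded server that takes 200 ms to arrive.
        CompletableFuture<String> slowReply = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(200);
            } catch (InterruptedException ignored) {
            }
            return "player-42";
        }, pool);

        System.out.println(awaitOrNull(slowReply, 50));   // 50 ms budget expires first -> null
        System.out.println(awaitOrNull(slowReply, 1000)); // generous budget -> player-42
        pool.shutdown();
    }
}
```

The point of the sketch: the same pending result reads as null or as a value depending purely on how the wait budget compares to the server's response time under load.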

SOLUTION:

I think you can configure the timeout for operations on the LettuceConnection using the LettuceConnectionFactory.

For example:

@Configuration
class ApplicationRedisConfiguration {

  @Bean
  LettuceConnectionFactory redisConnectionFactory() {

    return new LettuceConnectionFactory(<redisConfigurationImpl>, LettuceClientConfiguration.builder()
        .commandTimeout(Duration.ofSeconds(30L))
        .build());
  }
}

The <redisConfigurationImpl> could be any of [ RedisConfiguration, RedisClusterConfiguration, RedisSentinelConfiguration, RedisStandaloneConfiguration ] depending on your Redis topology arrangement.

See LettuceConnectionFactory constructors in Javadoc for more details.

See the LettuceClientConfiguration Javadoc for more details.

Based on the source of the LettuceClientConfiguration, it also appears that the timeout can be possibly configured in the Lettuce RedisURI (i.e. the connection URL used to connect to your Redis server(s)/cluster) as well. So you might want to double check how this is configured in your respective environments (DEV, TEST, PROD) as you may not see the same behavior in different environments since the configuration might vary significantly.
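As a hedged illustration of that RedisURI-based alternative (the host and port are placeholders; withTimeout is the Lettuce builder method carrying the URI-level timeout):

```java
import io.lettuce.core.RedisURI;
import java.time.Duration;

class RedisUriTimeoutExample {

    static RedisURI timeoutAwareUri() {
        return RedisURI.builder()
            .withHost("redis.example.com")   // placeholder host
            .withPort(6379)
            .withTimeout(Duration.ofSeconds(10))
            .build();
    }
}
```

The same timeout can reportedly also be expressed in URL form (e.g. redis://redis.example.com:6379?timeout=10s), but you should verify that against the Lettuce version in your environment.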

In summary...

Although, I am not 100% certain (I have not tested), I think you might be seeing null return values when the server(s)/cluster is loaded because 1) the operations/commands invoked inside SD Redis are executed "asynchronously" with the Lettuce driver and 2) the operation is timing out when the servers are loaded (due to a large number of requests). Therefore, I think you need to tune your client connection and possibly the Redis servers in your cluster, in this case.

The asynchronous invocation of Redis commands inside SD Redis is not something that is likely to change either.

Let me know if this helps or not.

Cheers!

Hwan-seok commented 1 year ago

Hi @jxblum! Thanks for the answer 😄

It seems the default command timeout is 60 seconds based on here. So I don't understand how changing the command timeout from 60s to 30s makes a difference in this situation.

Could you explain a little more details about command timeout? Thanks!

mp911de commented 1 year ago

If you use it like the above, there will be cases where there is no data even if there is data

It would be good for us if you could provide us with a minimal reproducer along with either an rdb file or a redis script to populate the redis data so we can investigate at which point a mismatch happens. Alternatively, you now have guidance in terms of invoked methods that you could debug yourself.

The asynchronous invocation of Redis commands inside SD Redis is not something that is likely to change either.

Lettuce uses an event-loop based I/O approach as its general mode of operation. The synchronous/asynchronous/reactive APIs are merely a facade on top of the event loop. We simplified the design of using Lettuce over the years from mixed sync/async towards using the asynchronous API only. Previously, when we used the synchronous API, Lettuce enforced the synchronization timeout. Now Spring Data Redis applies the command timeout to the RedisFuture when waiting for a result.

jxblum commented 1 year ago

Hi @Hwan-seok -

Regarding...

It seems the default command timeout is 60 seconds based on here. So I don't understand how changing the command timeout from 60s to 30s makes a difference in this situation.

When I traced through the code I did not look at the initial configuration of the command timeout (i.e. 60 seconds).

In my example, I arbitrarily picked 30 seconds since 1) I figured the actual command timeout used by Redis client drivers (e.g. Lettuce or Jedis) would be much less and 2) 30 seconds typically corresponds to many request timeouts used in other systems of communication.

Sorry for the confusion.

Additionally, I think it is important to consider the use case(s) and requirements (even SLAs) in order to tune the system (clients and servers) accordingly. At the risk of stating the obvious, there is no 1 size fits all in every situation. You have to balance load with responsiveness in the application. Too long of a timeout and users are going to be impacted. Too short of a timeout and the application will need to adjust in flight. Either way, it requires work on the application/system and critical design choices will need to be made.

Finally, I am still reasonably confident that, given the asynchronous nature of the Redis command execution (as @mp911de more accurately described above), execution of the Redis command in certain instances is taking longer than expected and the RedisFuture does not yet contain a value. It really is no different than Java's Future.get(timeout, :TimeUnit) operation that returns the asynchronous computed result "if available".

1 more thing...

Arguably, the RedisFuture.await(..) call in the stack I demonstrated above could 1) throw a (configurable) TimeoutException like Future.get(..), or 2) the call could simply be replaced with RedisFuture.get(timeout, :TimeUnit) in SD Redis's LettuceConnection (perhaps). Since RedisFuture already extends Future, I would not change the contract of await, personally.

Hwan-seok commented 1 year ago

Thanks for the very detailed explanation @mp911de, @jxblum !

The reason I couldn't provide a minimal reproducible example is that the issue didn't occur when I tried to reproduce it using only findByIdOrNull().

But I managed to find out that it is caused by a combination of findByIdOrNull() and save().

The save() method internally uses put, which first dispatches the del command, then re-sets the hash with hmset, and finally updates its indexes. This sequence of commands is not wrapped in multi or a transaction, so it is not safe under a race condition. Suppose the hgetall command (from findByIdOrNull) is dispatched in the middle of the del and hmset (from save); it may return null.

Sorry for the confusion.

Maybe save() issues these commands by design, so I assume this issue is not a bug but normal behavior. But IMHO, it would be better if they were wrapped in multi or something similar so that we could be sure it is thread-safe. What do you think?
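The interleaving described above can be replayed deterministically with a plain-Java stand-in (all names are hypothetical; a ConcurrentHashMap plays the role of the Redis keyspace, and the steps are sequenced by hand to force the racy ordering):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class SaveRaceSketch {

    // A plain map standing in for the Redis keyspace (hypothetical stand-in).
    static final Map<String, Map<String, String>> store = new ConcurrentHashMap<>();

    // Mimics the non-atomic put used by save(..): DEL followed by HMSET.
    static void saveDelThenSet(String key, Map<String, String> hash) {
        store.remove(key); // DEL
        // <-- a findByIdOrNull() dispatched right here observes null
        store.put(key, hash); // HMSET
    }

    // Mimics findByIdOrNull(..): HGETALL, null when the key is absent.
    static Map<String, String> findByIdOrNull(String key) {
        return store.get(key);
    }

    public static void main(String[] args) {
        saveDelThenSet("player:1", Map.of("name", "alice")); // save(1), completed

        // Replay the racy interleaving of save(2) and the concurrent read:
        store.remove("player:1");                              // save(2): DEL done
        Map<String, String> seen = findByIdOrNull("player:1"); // reader fires now
        store.put("player:1", Map.of("name", "alice-v2"));     // save(2): HMSET done

        System.out.println(seen); // null: neither the old nor the new value
    }
}
```

Because the delete and the re-set are two separate steps, a read landing between them sees neither version of the data, which is exactly the null observed in the issue.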

jxblum commented 1 year ago

Hi @Hwan-seok -

I apologize for the lapse in communication; I got tangled up in other things.

First, thank you for sharing your findings and clarifying the technical details in your use case. This makes more sense. So, let's dig in.

Regarding...

"Maybe the save() has these commands by design so I assume this issue is not a bug but normal behavior."

Again, I am no Redis expert (I have much to learn yet), and the code in SD Redis was written well before my time, but after careful review and consideration, I do think the sequence of Redis commands in the RedisKeyValueAdapter.put(..) operation used by CrudRepository save(..) was deliberate and by design.

Although I am not 100% certain that every Redis command in the put operation was necessary in this case, like DEL (particularly given that the HSET command, or rather the now deprecated but still used HMSET command, "overwrites" the data), it also seems to me that the exact sequence of commands in put would not matter anyway, even if put were implemented as only a single command (e.g. just HSET).

NOTE: Minimally, it seems logical to me the object would need to be 1) updated (saved) at the very least (obviously) 2) eviction triggered (possibly necessary for some notifications, such as SD Events? or session management) and 3) updating the index on the data (structure) to access the updated object efficiently.

NOTE: I also don't think SD Redis Repository impl is smart in that it only updates fields in the hash identified by the key (entity ID) for properties/fields (state) in the object that changed. I could be wrong, but I don't believe so.

Referring back to something @mp911de stated above in his comment, "Lettuce uses an event-loop based I/O approach as general mode of operation. The synchronous/asynchronous/reactive APIs are merely a facade on top of the event loop. We simplified the design of using Lettuce over the years from mixed sync/async towards using the asynchronous API only."

Anytime you have compound actions (in the same call or across calls) that are not atomic, there is always the possibility for a race condition.

So, even if the put operation consisted only of the HSET command and your application immediately followed the call to Repository save(..) with Repository findByIdOrNull(..), AND even if your application (or test) were to perform both Repository operations in the same Thread, it is still possible to get null values because, the fact of the matter is, there will always be multiple Threads involved between the 2 Repository operations (since Lettuce uses asynchronous invocations for each of the HSET (in save(..)) and HGET (in findByIdOrNull(..) commands involved) and you simply don't know whether the save(..) (HSET) technically succeeded or not until the client receives an ACK from the server, upon which you are waiting on because of the async invocation of Redis commands in Lettuce in the first place.

Therefore, the following sequence of ops (and associated commands) can still fail even when performed by the same Thread: save(..) using async(HSET) followed by findByIdOrNull(..) with an async(HGET).

The following order of operations/commands would absolutely need to happen to get a non-null response: save(..) using async(HSET), client receives ACK from server for HSET, then findByIdOrNull(..) with the async(HGET) happens.
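That required ordering can be sketched with plain Java futures (names are hypothetical; a map stands in for the server, and future completion stands in for the client receiving the ACK):

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

public class OrderedAsyncSketch {

    public static void main(String[] args) throws Exception {
        ExecutorService io = Executors.newFixedThreadPool(2);
        Map<String, String> server = new ConcurrentHashMap<>(); // stand-in for the server

        // async(HSET): the write completes on an I/O thread, like a driver ACK.
        CompletableFuture<Void> hset =
            CompletableFuture.runAsync(() -> server.put("player:1", "alice"), io);

        // Chaining the read after the write's completion (the "ACK") guarantees
        // visibility; firing the read independently would race with the write.
        String value = hset
            .thenApplyAsync(ack -> server.get("player:1"), io) // async(HGET)
            .get(1, TimeUnit.SECONDS);

        System.out.println(value); // alice
        io.shutdown();
    }
}
```

Without the thenApplyAsync chaining, the read and the write would be two independent futures, and the read could complete first, which mirrors the save/find race described above.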

NOTE: The possibility of race conditions is ever-present when the success of the operations involved depends on timing.

Even then (with an ACK), there is still a possibility of a null return value if you think about this from the context of a clustered Redis topology, where the read could come from a Replica. The replication of values between Masters and Replicas in Redis is (always?) asynchronous. So, whether single-Threaded or multi-Threaded/multiple processes (multiple instances of your application) is involved, the problem could still manifest itself.

Regarding...

"it would be better if they are tied up with like multi or something so that we can confirm that it is thread-safe"

Initiating a transaction isn't necessary in every use case and will cause additional overhead and latency in use cases where more stringent consistency guarantees are not required. You can certainly do so though by simply enabling and using Spring's Transaction Management infrastructure and support with SD Redis (see here).
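As a hedged sketch of that opt-in (class and bean names are placeholders; setEnableTransactionSupport(true) is the Spring Data Redis switch that binds command execution, via MULTI and EXEC/DISCARD, to the surrounding Spring-managed transaction):

```java
@Configuration
@EnableTransactionManagement
class RedisTransactionConfiguration {

  @Bean
  RedisTemplate<String, Object> redisTemplate(RedisConnectionFactory connectionFactory) {

    RedisTemplate<String, Object> template = new RedisTemplate<>();
    template.setConnectionFactory(connectionFactory);
    // Queue commands in MULTI while a Spring-managed transaction is active.
    template.setEnableTransactionSupport(true);
    return template;
  }
}
```

With this in place, a service method annotated with @Transactional (or a TransactionTemplate callback) has its Redis writes queued and committed together rather than interleaved with reads from other threads.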

However, the point I was trying to make above is that it is not going to be Thread-safe given the async nature of the command execution, as applied by Lettuce anyway (I am not certain about Jedis), especially without explicit coordination (like transactions), which should be an opt-in approach given it is highly application requirement/SLA dependent.

Other things to consider or try:

1) I am not sure if pipelining would help much in this case. I kind of doubt it, but it came to mind as I was reviewing the code.

2) You still have to be very much cognizant of a clustered Redis deployment and the (configurable?) asynchronous replication between nodes (Redis servers), and where read operations (can) originate, which might have an affinity based on locality.

3) What happens with Jedis in this scenario I wonder?

4) There may be other applicable configuration options to consider, and tuning plays a big role particularly when given different loads (at different times) in different server/cluster topology arrangements, and so on and on. You have to carefully consider PROD since your DEV environment probably looks nothing like PROD, too.

5) Test the use of Spring's Transaction Management here. I think this should prove invaluable, maybe in this particular use case, or you application at large, particularly if consistency is important. But, remember...

6) (As I am certain you are well aware), Redis is an eventually consistent system by design and Redis does not offer strong consistency guarantees, particularly in a clustered topology arrangement (see here, and search for "Redis Cluster consistency guarantees").

7) For cases requiring strong consistency, a more suitable data store (e.g. Hazelcast) may be necessary. Of course, I understand, and even concur with, the strong desire to get as much mileage as possible out of the chosen, existing, and familiar data store.

8) ???

Given there is no one-size-fits-all here, and a lot has to do with the chosen driver (Lettuce), I am not inclined to say anything is necessarily wrong with Spring Data Redis (and its Repository infrastructure extensions) in this case.

I hope I made sense.

jxblum commented 1 year ago

This may have boiled down to a philosophical discussion, and given my lack of experience with Redis, I am not sure I am being that helpful or can offer any additional insights into this case. Hopefully, I have given you something to think about.

In any case, I don't think this will lead to changes in Spring Data Redis ATM given I am not convinced Spring Data Redis is necessarily doing anything wrong by design. I also feel like there are options (e.g. using transactions) that should prove useful in this case.

I am always open to receiving any additional feedback. (This has been a great learning opportunity for me).

Thank you.

Hwan-seok commented 1 year ago

Thanks for the detailed response again @jxblum 😄

I am no expert on SD or Redis either. You must be far more of an expert than me.

I agree with what you mentioned:

consisted only of the HSET command and your application immediately followed the call to Repository save(..) with Repository findByIdOrNull(..), AND even if your application (or test) were to perform both Repository operations in the same Thread, it is still possible to get null


Although, I am not 100% certain that every Redis command, like DEL in the put operation, was necessary in this case (particularly given the HSET command, or rather the now deprecated, but still used HMSET command, "overwrites" the data), it also seems to me that the sequence of commands in put, or even if put were implemented as only a single command (e.g. just HSET), would not matter anyway.

I also don't think SD Redis Repository impl is smart in that it only updates fields in the hash identified by the key (entity ID) for properties/fields (state) in the object that changed. I could be wrong, but I don't believe so.

Totally agree, at least IMHO. But neither HSET nor HMSET overwrites the entire hash; they only upsert individual fields. So the del in save seems inevitable.

Also, I have some thoughts which are almost same aligned with the above:

In any environment, FWIW, whether with a different replica or driver, consider the case where a hash has already been set. It's natural that save() and findByIdOrNull() can race on the hash, but in my humble opinion, findByIdOrNull() should at least return a previous value. Imagine the following sequence of operations, where save(2) and findByIdOrNull() fire at a similar point in the timeline.

Thread 1: save(1)---done(1)---save(2)
Thread 2: ---------------------findByIdOrNull()

In this race condition, I don't expect the result of findByIdOrNull() to necessarily be 2, but it should at least be 1 or 2, not null. Yet it can be null in the current implementation, because save deletes the hash completely and then re-sets it.

But I don't think there is a Redis operation that overwrites the entire hash; hash commands only touch individual fields. Also, SD Redis cannot easily compare the current fields with the updated fields locally, because distributed applications using SD Redis may define those entities differently in each application. And I assume save() means "Hey, I don't care about the current value, just overwrite it with this one!". So the conclusion is: it could be done by putting the commands in a transaction (MULTI), but it should be done by users.

Anyway, thanks for your kind help in figuring out this 😃

jxblum commented 1 year ago

Hi @Hwan-seok -

Regarding the scenario you nicely demonstrated:

Thread 1: save(1) --- done(1) --- save(2)
Thread 2: ------------------------ findByIdOrNull()

... expecting to see the previous value (1) from the first (completed) save(1) before the call to findByIdOrNull(..), particularly if the second save(2) has not completed yet (i.e. is "in the middle") or threw an Exception ... is a great point!

Given the current implementation of KeyValueRepository save(..) in SD Redis, without transactions, another Thread calling findByIdOrNull(..) can definitely witness in-flight operations given save(..) executes (minimally) 3 separate Redis commands in the implementation. And, we have come full circle. 😄

So, I agree with:

So the conclusion is: it could be done by putting the commands in a transaction (MULTI), but it should be done by users.

Correct!

And, a transaction can be initiated when entering any @Transactional annotated application service/DAO method, or programmatically when using TransactionTemplate (as described in the docs). Either way, any Thread, or even another process in a distributed application using the Redis cluster topology, should minimally see the previous value (i.e. "1" from save(1) in the scenario you demonstrated above) when transactions are used appropriately.

I stand corrected on the hset Redis command (or hmset for that matter). Indeed, it only takes fields/values of the hash, so deleting the previous entry is required given SD Redis has no means of tracking what state in the object, subject to the save(..), has changed.

You are welcome for the help. I really appreciated and enjoyed our conversation.

Good luck!

jxblum commented 1 year ago

Hi @Hwan-seok -

I was just thinking about 1 more thing I remembered seeing in the execution path of Repository save(..) that I wanted to call to your attention; this. The (end-of-line) comment makes it sound like, if the operation were to fail, then you would at least see the previous value (stored under a "phantom key") once restored??

NOTE: In other words, given enough calls to findByIdOrNull(..), it would either return the new value or, if the save ultimately fails, the previous value. Seeing an (intermediate) null value is an unfortunate side effect of the race condition, given the implementation of save using del and hset. Of course, permanently seeing null implies that 1) the Redis server (node in the cluster) failed after del and before hset, and now 2) there is data loss, even in a durable setup.

I am not sure if shadow copying is something SD Redis is doing (possibly; see note below) or Redis itself, but appears to be enabled with configuration (i.e. keepShadowCopy()).

NOTE: I don't recall ever seeing a shadow copy (of the value?) being made in the execution path, stored under a phantom key, during save(..), before the put(..) operation in RedisKeyValueAdapter is called. I am not even certain what this block is really attempting to do. I will need to get some guidance from the team (@mp911de or @christophstrobl) on this one.

I have also not tested my claim above, so I am not certain I am even correct about restoration of previous values when an attempted save(..) runs amuck. All I know is, it does seem dangerous if a del can happen without some form of restore if hset were to fail all because save(..) uses multiple Redis commands (a recipe for data loss in this scenario). It really makes the comments you made earlier hit home for me.

Still, this can easily happen in other scenarios as well (such as, when a write happens, but the master node fails before the value can be replicated in an HA arrangement).

However, 1 thing is for certain, either way, it definitely won't solve the race-condition in a multi-Threaded context either, as I am sure you are quite aware.

Hwan-seok commented 1 year ago

Due to my lack of understanding, I couldn't find any restoration process related to phantom keys 😅. Perhaps another part of SD Redis is using them?

Indeed.

All I know is, it does seem dangerous if a del can happen without some form of restore if hset were to fail all because save(..) uses multiple Redis commands (a recipe for data loss in this scenario).

Even though this could be directly related to a race condition, it could be another issue if there is no process to restore deleted values.

Although, as you mentioned, the phantom key does not help with the race condition 🤣

However, 1 thing is for certain, either way, it definitely won't solve the race-condition in a multi-Threaded context either, as I am sure you are quite aware.

jxblum commented 1 year ago

Hi @Hwan-seok -

I was doing more digging ("research", 😄 ) on this issue today and came across this source in SD Redis.

Essentially, if your application domain model object (entity) extends the PartialUpdate class (Javadoc), then SD Redis will update the associated Redis hash for only the properties of the entity that have changed.

This should be the effect of the RedisKeyValueAdapter.update(..) method, here. The rdo.getBucket().rawMap() method should return a Map of only field/values (properties) of the entity that have actually changed, passed to the hset command, thereby updating the Redis hash identified by key (keyspace-based entity ID) for the entity. The RDO (RedisData object) is created (and initialized) from the PartialUpdate object passed to update(..).

NOTE: This is in contrast to the RedisKeyValueAdapter.put(..) operation (here) that constructs/initializes a RedisData object (RDO) from the entire object (entity) being saved.

In addition, there is no prior del command being issued in the RedisKeyValueAdapter.update(..) method to clear the state of the object/entity from the Redis hash prior to saving the changed/updated object/entity.
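For illustration, a hedged usage sketch of that partial-update path, following the shape shown in the SD Redis reference documentation (the entity type, id, and field names are placeholders):

```java
// Only the fields named here are written via hset; no prior del is issued.
PartialUpdate<Player> update = new PartialUpdate<>("player:1", Player.class)
    .set("name", "alice")   // update a single field of the hash
    .del("nickname");       // explicitly remove a field, if desired

redisKeyValueTemplate.update(update);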

Furthermore, this should also eliminate any data loss as well as the strict need for wrapping Repository operations, such as save(..) and findByIdOrNull(..), in a transaction (which SD Redis implements with the MULTI command followed by either EXEC or DISCARD), unless strict consistency and durability/reliability are required.

Unfortunately, SD Redis's Repository save(..) method is NOT implemented to take partial updates into consideration. 😞

NOTE: On the flip side, an unfortunate side effect of using partial updates is having to extend SD Redis's PartialUpdate class in your application domain model class, coupling your application to SD Redis.

If we trace this through, we'd see that SD Redis's Repository extension primarily builds on SD KeyValue's base Repository infrastructure. That is, RedisRepositoryFactory extends KeyValueRepositoryFactory. Therefore, the implementation of the application-declared Repository interface would be backed by SimpleKeyValueRepository (here).

NOTE: Most Spring Data modules, such as SD MongoDB, either 1) don't build on the SD KeyValue module naturally, or 2) will override the getRepositoryBaseClass(:RepositoryMetadata) method and return a store-specific implementation (for example).

NOTE: Of course, SD KeyValue contains a nice foundation for key/value stores, like Redis (Hazelcast (AFAIR) extends SD KeyValue as well), and most of the implementation details are encapsulated in, and delegated to, the KeyValueAdapter implementation (Javadoc), which in SD Redis, is the RedisKeyValueAdapter (Javadoc), which has been the primary focus of our conversation.

SD KeyValue's SimpleKeyValueRepository.save(..) method (source) calls the KeyValueOperations.update(ID, <entity of type T>) method (here) directly.

If save(..) had rather called KeyValueOperations.update(<entity of type T>) (Javadoc) instead, the outcome would be different. This is because SD Redis's implementation of the SD KeyValue KeyValueOperations interface is RedisKeyValueTemplate (Javadoc). And, if we look at the update(<entity of type T>) method implementation, we see that it would handle "partial updates" by calling the RedisKeyValueAdapter.update(:PartialUpdate) method (source).

However, and again, since SimpleKeyValueRepository.save(..) calls the KeyValueOperations.update(ID, <entity of type T>) method directly (source), RedisKeyValueAdapter simply passes the call through (source) to the default implementation provided in SD KeyValue's KeyValueTemplate (source), which just calls the [Redis]KeyValueAdapter.put(..) method, thereby bypassing the partial update.

I am not convinced this was by design.

jxblum commented 1 year ago

@mp911de & @christophstrobl -

After a complete and thorough analysis of this issue, I am convinced (and conclude) that either:

1) SD KeyValue's SimpleKeyValueRepository.save(..) method (source) is calling the wrong KeyValueOperations.update(..) method (Javadoc), that is update(ID, <entity of type T>) rather than, and more appropriately, update(<entity of type T>), or...

2) SD Redis should provide a custom extension of SD KeyValue's SimpleKeyValueRepository class (or a custom implementation of SD Common's ListCrudRepository<T, ID> interface, in general), overriding the save(..) method to call the appropriate KeyValueOperations.update(..) method, thereby enabling partial updates to be taken into consideration.

NOTE: See my comment above for complete details.

If we were to change SD KeyValue's SimpleKeyValueRepository.save(..) method to instead call KeyValueOperations.update(<entity of type T>), it would still 1) ultimately call KeyValueOperations.update(ID, <entity of type T>), resolving the entity ID in a similar manner as before, and would 2) allow SD KeyValue module extensions, like SD Redis, to override and customize the default behavior to handle special concerns, like partial updates in the case of Redis.

While this does not eliminate all possible race conditions, I do think it would solve the null intermediate values race condition and prevent data loss. Again, I would still encourage users that require strict consistency along with durability and/or reliability to use transactions in this case.

By modifying SD KeyValue, it would keep all SD KeyValue module extensions consistent in behavior.

However, if there is any concern about adversely affecting SD modules extending from SD KeyValue, then option #2 presented above could be an acceptable choice. Although, to my knowledge, only SD Redis and SD Hazelcast extend from SD KeyValue.

WDYT?

jxblum commented 1 year ago

Per the discussion above, the SD Redis Repository save(..) method is by design, using the DEL command followed by the HMSET command.

There is no guarantee, especially in a multi-Threaded (concurrent) or distributed (clustered) environment that the Repository save will be atomic, unless the user opts to use (Redis) Transactions, which are supported by Spring Data Redis using core Spring Framework Transaction Management infrastructure. More information on using Redis Transactions in a Spring context can be found in the SD Redis ref doc, here.

Given there is no further activity on this Issue ticket, and SD Redis functions as designed, I am closing this ticket.