spring-projects / spring-data-geode

Spring Data support for Apache Geode
Apache License 2.0

findById() returns the wrong data [DATAGEODE-388] #433

Closed: spring-projects-issues closed this issue 3 years ago

spring-projects-issues commented 3 years ago

Claudiu Balciza opened DATAGEODE-388 and commented

https://github.com/claudiu-balciza/SpringBootDataGeodeFindByIdError/tree/develop

I'm getting unexpected data back from Geode 1.13 when calling findById() through Spring Data for Apache Geode.

This is proof-of-concept code.

The logic is implemented in MyRunner.java.

The class is simple, and the data is stored correctly in Geode 1.13.

The region is created like this:

gfsh> create region --name=TestRegion --type=PARTITION_REDUNDANT_PERSISTENT --recovery-delay=10000 --disk-store=DataPersistence --enable-statistics=true --eviction-action=overflow-to-disk --compressor='org.apache.geode.compression.SnappyCompressor' --redundant-copies=1

Issue: findById() returns the wrong data while findAll() works fine


Reference URL: https://stackoverflow.com/questions/64668441/apache-geode-crud-repository-findbyid-returns-the-wrong-array

spring-projects-issues commented 3 years ago

yozaner1324 commented

This is actually Apache Geode behavior rather than anything Spring-specific. The problem is that the local region holds on to values by reference, and the objects m and l are being reused for every entry. The wrong values are only seen when looking up entries by id, because findAll() and queries go straight to the server rather than checking the local region first.
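The behavior described above can be reproduced in miniature without Geode at all. The following is only a sketch under stated assumptions: FakeClientRegion is a hypothetical stand-in (two plain maps, not the Geode API), the "server" side is assumed to keep a copy of each value as serialization would, and the reused long[] named l merely echoes the reused objects mentioned above rather than the actual code in MyRunner.java.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class LocalReferenceDemo {

    // Hypothetical stand-in for a client region (NOT the Geode API).
    // "server" keeps a copy of each value, as sending it over the wire would;
    // "local" keeps the very instance the caller passed in.
    static final class FakeClientRegion {
        private final Map<Integer, long[]> server = new TreeMap<>();
        private final Map<Integer, long[]> local = new HashMap<>();

        void put(int key, long[] value) {
            server.put(key, value.clone()); // snapshot taken at put time
            local.put(key, value);          // reference to the caller's array
        }

        // Models a lookup by id: the local entry is consulted first.
        long[] get(int key) {
            long[] cached = local.get(key);
            return cached != null ? cached : server.get(key);
        }

        // Models a query: answered from server-side data only, in key order.
        List<long[]> queryAll() {
            return new ArrayList<>(server.values());
        }
    }

    // Reuses ONE array instance for every entry, which is the pattern at fault.
    static FakeClientRegion loadReusingOneArray() {
        FakeClientRegion region = new FakeClientRegion();
        long[] l = new long[1];
        for (int id = 1; id <= 3; id++) {
            l[0] = id * 10L;
            region.put(id, l);
        }
        return region;
    }

    public static void main(String[] args) {
        FakeClientRegion region = loadReusingOneArray();
        // Lookup by id sees the shared array, so it reports the last value written.
        System.out.println("get(1)      -> " + region.get(1)[0]);            // 30
        // The query sees the server-side snapshot taken when id 1 was stored.
        System.out.println("queryAll[0] -> " + region.queryAll().get(0)[0]); // 10
    }
}
```

In this model, allocating a fresh array for each entry (instead of mutating and re-putting the same one) makes both read paths agree, because no two local entries share an instance.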

spring-projects-issues commented 3 years ago

yozaner1324 commented

Corresponding GEODE Jira: https://issues.apache.org/jira/browse/GEODE-8733

spring-projects-issues commented 3 years ago

John Blum commented

Thank you, yozaner1324, for the analysis and for tracking down the root cause of the problem.

To tie this ticket off, I want to share what I shared internally about this particular problem and why it is specifically a GEODE issue, not one caused by Spring (Data)!

In a nutshell, the difference between CrudRepository.findAll() (specifically, the GemfireRepository; SDG's implementation of the Spring Data CrudRepository interface for Apache Geode/VMware Tanzu GemFire) vs. CrudRepository.findById(..) is that...

  1. findAll() ultimately runs an Apache Geode OQL Query: it uses the Geode OQL Query API, specifically the QueryService, to construct an OQL Query, execute it and get back the SelectResults, from which the objects are extracted, whereas...

  2. findById(..) invokes the Region.get(key) operation directly.

Keep in mind that SDG provides a base implementation of CrudRepository's basic CRUD operations (e.g. findById(..), save(..)) and simple (OQL) query operations (e.g. findAll()).

This becomes apparent when you look into the SD[G] Repository infrastructure implementation.

First, your SDG application-specific GemfireRepository interface extension is backed by SDG's SimpleGemfireRepository class. Most Spring Data modules that support the SD Repository abstraction for their backend data store (e.g. GemFire/Geode, MongoDB, Redis, etc.) provide a base/default, "simple" Repository implementation for the basic CRUD and simple query data access operations.

The provided, simple Repository implementation, in good Spring fashion, delegates to a Template (just as JDBC operations are encapsulated by Spring's JdbcTemplate class). In SDG's case, that would be the GemfireTemplate class.

So, if we peer into the SDG source code, we see that for findAll(), the SimpleGemfireRepository delegates to the GemfireTemplate, which executes an OQL Query.

However, for findById(..), we see that SimpleGemfireRepository delegates to the GemfireTemplate, which ultimately executes the Region.get(key) operation.
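The delegation chain described above (Repository to Template to either an OQL Query or Region.get(key)) can be sketched with plain Java stand-ins. This is an illustration only, not the SDG source: FakeRepository, FakeTemplate and FakeDataStore are hypothetical names, the method shapes are assumptions, and the OQL text is just an example built from the TestRegion name used earlier in this ticket.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

class DelegationSketch {

    // Hypothetical stand-in for the Geode side (Region plus QueryService).
    // It only records which low-level operation each call ended up in.
    static final class FakeDataStore {
        final List<String> calls = new ArrayList<>();

        Object get(Object key) {
            calls.add("Region.get(" + key + ")");
            return "value-for-" + key;
        }

        List<Object> executeQuery(String oql) {
            calls.add("OQL: " + oql);
            return Arrays.<Object>asList("value-for-1", "value-for-2");
        }
    }

    // Hypothetical stand-in for the template layer.
    static final class FakeTemplate {
        private final FakeDataStore store;

        FakeTemplate(FakeDataStore store) {
            this.store = store;
        }

        Object get(Object key) {
            return store.get(key);
        }

        List<Object> find(String oql) {
            return store.executeQuery(oql);
        }
    }

    // Hypothetical stand-in for the "simple" repository implementation.
    static final class FakeRepository {
        private final FakeTemplate template;
        private final String regionPath;

        FakeRepository(FakeTemplate template, String regionPath) {
            this.template = template;
            this.regionPath = regionPath;
        }

        // Key-based read: ends in a direct get on the region.
        Optional<Object> findById(Object id) {
            return Optional.ofNullable(template.get(id));
        }

        // Bulk read: ends in a query.
        List<Object> findAll() {
            return template.find("SELECT * FROM " + regionPath);
        }
    }

    public static void main(String[] args) {
        FakeDataStore store = new FakeDataStore();
        FakeRepository repository = new FakeRepository(new FakeTemplate(store), "/TestRegion");

        repository.findById(1);
        repository.findAll();

        // Prints the two different low-level paths, in call order:
        // Region.get(1)
        // OQL: SELECT * FROM /TestRegion
        store.calls.forEach(System.out::println);
    }
}
```

The point of the trace is that the two repository methods never share a read path, so a difference in how Geode answers a get versus a query surfaces as a difference between findById(..) and findAll().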

This is why Geode is the culprit in this case, hence GEODE-8733.

Hope this helps to clarify the matter.