oracle / coherence

Oracle Coherence Community Edition
https://coherence.community
Universal Permissive License v1.0

Possible to insert data directly to front cache without replication to back cache? #129

Closed javafanboy closed 2 months ago

javafanboy commented 2 months ago

After what I think so far is a successful test of saving and restoring data from the partitioned back tier of a near cache to and from S3 (it seems possible to reach nearly half the aggregated link speed of all storage-enabled nodes; the other half goes to replication, as we use a backup count of 1), I am wondering whether, for the same purpose (i.e. getting test environments up and running as fast as possible), it is technically possible to write data (in this case as Java objects rather than POF data, of course) to front caches without triggering replication to the back cache. As far as I know there is no "getFrontCache", but perhaps there is a way to get to the "backing map" directly on the clients as well? I have found that the front map can be accessed by casting the cache to CacheMap, but inserting objects into the front map still replicates to the backing map, so that alone does not seem to make what I want possible... I guess I would like to do a "synthetic" put, if that is possible.

If this were possible, one could create an S3 backup of the front cache in the client with the highest hit rate and, when new test environments are created, restore it on every non-storage-enabled node while a corresponding L2 (back tier) snapshot is loaded...

With a lot of L1 (front) caches, loading them all with hot content at startup takes a fairly long time for us (and puts HEAVY load on the L2 back tier). For context, we need to pre-load L1 because the objects in our application are not read in bulk but rather discovered little by little (i.e. each object contains keys to other objects, forming a kind of "graph"). With an empty L1 front cache this results in potentially tens of thousands of individual cache misses, each causing a remote request to the back cache. Without pre-loading, quite a few client requests would take many seconds, or even a minute or more, to complete, resulting in an unacceptable user experience (or even HTTP timeouts for service invocations) :-(
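To make the miss pattern described above concrete, here is a minimal toy model in plain Java. It does not use any Coherence API: `CountingFront`, `buildGraph`, and `walk` are hypothetical names, the "back tier" is just an in-process map, and the binary-tree shape of the graph is an assumption chosen only for illustration. It shows that walking a graph of N objects through an empty front tier costs one remote read per object, while the same walk against a warm front tier costs none.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class ColdFrontDemo {

    /** Toy read-through front tier that counts round trips to the back tier. */
    static final class CountingFront {
        private final Map<Integer, List<Integer>> front = new HashMap<>();
        private final Map<Integer, List<Integer>> back;
        int remoteReads = 0;

        CountingFront(Map<Integer, List<Integer>> back) {
            this.back = back;
        }

        /** Returns the keys referenced by the given object, fetching from the back tier on a miss. */
        List<Integer> get(Integer key) {
            List<Integer> refs = front.get(key);
            if (refs == null) {
                remoteReads++;              // stands in for one network request
                refs = back.get(key);
                if (refs != null) {
                    front.put(key, refs);   // populate the front tier on the way back
                }
            }
            return refs;
        }
    }

    /** Builds a tree-shaped object graph: object i holds the keys 2i+1 and 2i+2. */
    static Map<Integer, List<Integer>> buildGraph(int size) {
        Map<Integer, List<Integer>> graph = new HashMap<>();
        for (int i = 0; i < size; i++) {
            List<Integer> refs = new ArrayList<>();
            if (2 * i + 1 < size) refs.add(2 * i + 1);
            if (2 * i + 2 < size) refs.add(2 * i + 2);
            graph.put(i, refs);
        }
        return graph;
    }

    /** Discovers objects one at a time by following keys; returns how many were visited. */
    static int walk(CountingFront cache, int root) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            Integer key = pending.pop();
            if (!seen.add(key)) continue;
            List<Integer> refs = cache.get(key);
            if (refs != null) refs.forEach(pending::push);
        }
        return seen.size();
    }

    public static void main(String[] args) {
        CountingFront cache = new CountingFront(buildGraph(10_000));
        walk(cache, 0);
        System.out.println("remote reads, cold front: " + cache.remoteReads);
        int before = cache.remoteReads;
        walk(cache, 0);
        System.out.println("remote reads, warm front: " + (cache.remoteReads - before));
    }
}
```

In a real deployment each of those counted reads would be a network round trip made sequentially, since the next key is only known once the previous object has arrived, which is why the cold case adds up to seconds or minutes.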

Today we first load more or less our entire database into the L2 back tier (for test environments this will be replaced by loading from S3) and then load a large number of frequently used objects into each L1 front cache, a step I would also like to replace with parallel direct loading from S3 to each non-storage-enabled node...

aseovic commented 2 months ago

This would defeat the purpose of the near cache, which is to invalidate and evict entries in the front maps whenever they change in the backing map. With this in mind, what would having an entry in the front map that doesn't exist in the backing map even mean!?

If you want a "front/local map" on the client that is disconnected from a back cache, your only option is to use local-cache directly.

javafanboy commented 2 months ago

Maybe I explained it badly. My thought was to load, at the same time, the back cache with the full dataset (I already have the code for that) and the front caches with a subset of that data (the entries I know are the most used), and from the point that finishes use the near cache just like any other near cache. As the data in the front would be a true subset of that in the back cache, it would not result in any inconsistencies or work any differently than if the same data had been loaded normally. It would just load in a shorter time than first loading the back and then loading all the clients' front caches in parallel by reading the known "most used keys" from the back tier.
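The idea above can be sketched roughly as follows, again as a toy in plain Java rather than anything Coherence provides. The two tiers are ordinary concurrent maps, `restore` stands in for a hypothetical restore-from-S3 step, and `isSubset` checks the invariant the proposal relies on: every entry in the front tier must also exist, with the same value, in the back tier once both loads have finished.

```java
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

class ParallelWarmUpDemo {

    /** Copies a snapshot into a tier; a stand-in for a restore from S3 (hypothetical). */
    static void restore(Map<String, String> snapshot, Map<String, String> tier) {
        tier.putAll(snapshot);
    }

    /** True when every front entry is also present, with the same value, in the back tier. */
    static boolean isSubset(Map<String, String> front, Map<String, String> back) {
        return back.entrySet().containsAll(front.entrySet());
    }

    public static void main(String[] args) {
        // Assumed snapshots: the hot snapshot was taken from the same data as the full one.
        Map<String, String> fullSnapshot = Map.of("a", "1", "b", "2", "c", "3", "d", "4");
        Map<String, String> hotSnapshot = Map.of("a", "1", "c", "3");

        Map<String, String> back = new ConcurrentHashMap<>();
        Map<String, String> front = new ConcurrentHashMap<>();

        // Load both tiers at the same time instead of back first, then front.
        CompletableFuture<Void> backLoad =
                CompletableFuture.runAsync(() -> restore(fullSnapshot, back));
        CompletableFuture<Void> frontLoad =
                CompletableFuture.runAsync(() -> restore(hotSnapshot, front));
        CompletableFuture.allOf(backLoad, frontLoad).join();

        System.out.println("front is a subset of back: " + isSubset(front, back));
    }
}
```

The invariant only holds if the two snapshots are consistent with each other, and a real near cache would also have to start receiving invalidation events for the entries placed in its front map, which this sketch does not model.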

I was simply interested to know whether there was any not-too-complicated way to give it a try and see if it was worthwhile, but after looking a bit at the source it seems it would not be trivial (perhaps even requiring a custom cache implementation), so it is probably not something I will consider further at this time.
