oracle / coherence

Oracle Coherence Community Edition
https://coherence.community
Universal Permissive License v1.0

How to configure the same partitioned cache to be a back cache both for use "in cluster" and with gRPC? #126

Closed javafanboy closed 3 months ago

javafanboy commented 3 months ago

I am using Coherence 24.03 and have started to look at using the gRPC proxy to support "dynamically connected clients" (AWS Lambda functions) that cannot be members of a Coherence cluster, as well as clients written in languages other than Java.

As we already have a large "near cache" (local front and partitioned back) that is central to our system, I would like the new clients to also use it rather than create a new cache for use with gRPC.

I always struggle with how to write the XML cache configuration for my systems and this is no exception - can anybody please give me some hints if this is possible and if so how I could go about it?

thegridman commented 3 months ago

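@javafanboy Do you mean you want to use a NearCache on a Java gRPC client? This should just work: you define things in the client cache config as normal, but instead of an Extend remote-cache-scheme you define a remote-grpc-cache-scheme, as documented in "Defining a Remote gRPC Cache" (https://docs.oracle.com/en/middleware/standalone/coherence/14.1.1.2206/develop-remote-clients/using-coherence-java-grpc-client.html#GUID-8E1B2F97-9784-4204-9B00-92B3222DE043).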

javafanboy commented 3 months ago

I mean that I already have a "near cache" configuration that I use to efficiently share a lot of data with Java clients that are "regular members" of the Coherence cluster.

Now I would also like some other clients, the first and most important case being AWS Lambda functions written in Java (which, since Lambda functions are suspended when not called, can't maintain cluster membership), to both put and get objects in THE SAME "near cache", allowing them to interact with the existing "regular members".

Ideally I would like them to also have a "near" cache tier (but I am not sure how invalidation would work for this given the "intermittent life" of the Lambda functions), but even just access to the back tier of the near cache with no "front tier" would be a step in the right direction and potentially as efficient as other popular caching alternatives in the cloud that only provide access over the network.

My question is basically what additional configuration I would need to add in the cache config file to make this cache (or at least its back tier) reachable over gRPC from a Java client not part of the cluster?


thegridman commented 3 months ago

OK, if I understand correctly you have a NearCache on the cluster member that is running the proxy that the client connects to. In that case the NearCache will not be used by clients. I am pretty sure both Extend clients and gRPC clients will bypass any NearCache on the proxy and go straight to the back cache.

javafanboy commented 3 months ago

That would be just fine - my heaviest processing will still be in containers or VMs that are regular cluster members.

What additional cache config would be required to make that happen? I have already got the gRPC proxy running, but I fail to see what I would need to add in order to make the "near cache" also usable with it. The examples in the documentation assume I create a new cache configuration for gRPC.

Also, which would give the best performance today, Extend or the gRPC proxy? As I understand it, these are two different but "competing" solutions: gRPC seems to have the most future potential, but if I only use Java, Extend is better proven as of now.


javafanboy commented 3 months ago

To make my question clearer I created two minimal test programs: one that joins the Coherence cluster and inserts a value in a near cache, and another that tries to read the same value but does not join the Coherence cluster and instead tries to use the gRPC client. I also included the simple cache config I use. In this file I have so far only included what the documentation states is the "minimal configuration" for gRPC. I can confirm that when I launch the separate default cache server it reports starting the gRPC proxy, and the first program runs as expected. But as of now, when I run the second program (everything runs locally on my laptop), I get an exception because com.tangosol.net.Coherence.getInstance() is null. I assume this is because I need to add something more to the gRPC configuration, but I do not know what :-( All suggestions are warmly appreciated!


package com.test;

import com.tangosol.net.*;

public class MicroCoherenceTest {

static {
    System.setProperty("tangosol.coherence.wka", "127.0.0.1");
    System.setProperty("tangosol.pof.enabled", "true");
    System.setProperty("tangosol.pof.config", "custom-pof-config.xml");
    System.setProperty("tangosol.coherence.cacheconfig", "custom-cache-config.xml");
    System.setProperty("tangosol.coherence.distributed.localstorage", "false");
}

static final NamedCache<String, String> testCache =
        CacheFactory.getCache("near-test");

public static void main(String[] args) {
    testCache.put("hello", "world");
    System.out.println(testCache.get("hello"));
}

}


package com.test;

import com.tangosol.net.*;

public class MicroCoherenceRPCTest {

static {
    System.setProperty("tangosol.coherence.wka", "127.0.0.1");
    System.setProperty("tangosol.pof.enabled", "true");
    System.setProperty("tangosol.pof.config", "custom-pof-config.xml");
    System.setProperty("tangosol.coherence.cacheconfig", "custom-cache-config.xml");
    System.setProperty("tangosol.coherence.distributed.localstorage", "false");
    System.setProperty("coherence.tcmp.enabled", "false");
}

static final Session session = Coherence.getInstance().getSession();
static final NamedMap<String, String> testCache =
        session.getMap("near-test");

public static void main(String[] args) {
    // After inserting a value by running MicroCoherenceTest I want to be
    // able to retrieve it with this program!
    System.out.println(testCache.get("hello"));
}

}

<?xml version="1.0"?>

<cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config" xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-cache-config coherence-cache-config.xsd" xml-override="{coherence.cacheconfig.override}">

${coherence.scope} ${coherence.serializer} near-* ${coherence.profile near}-${coherence.client direct} * remote-grpc * topic-server near-direct {front-limit-entries 10000} thin-direct 0 thin-direct server false false server ${coherence.service.name PartitionedCache} true 31 true {back-limit-bytes 0B} true topic-server ${coherence.service.name Partitioned}Topic true 9 true {topic-high-units-bytes 0B} my-invocation-service InvocationService true remote-grpc RemoteGrpcCache

thegridman commented 3 months ago

There seem to be a few things here.

  1. Calling com.tangosol.net.Coherence.getInstance() on the server will return null if the server has not been started using the Coherence bootstrap API (see the docs here). If Coherence is not started correctly there will be no Coherence instance to return. The bootstrap API can also be used to start a client. If you do not start the server this way then there will be no gRPC proxy started for the client to connect to.

The simplest way to get a session at start-up on a cluster member is

Coherence coherence = Coherence.clusterMember().start().get();
Session session = coherence.getSession();
  2. On the client, your config looks ok. All caches map to the "remote-grpc" scheme. The way that is configured, it will use the Coherence NameService to look up the gRPC proxy endpoints to connect to, the same way Extend does. The paged-topic-scheme in that config will not work on a client, currently topics only work on a cluster member.

javafanboy commented 3 months ago

Thanks for the quick reply "gridman"!

Updating the gRPC test as follows:

package com.test;

import com.tangosol.net.*;

import java.util.concurrent.ExecutionException;

public class MicroCoherenceRPCTest {

    static {
        System.setProperty("tangosol.coherence.wka", "127.0.0.1");
        System.setProperty("tangosol.pof.enabled", "true");
        System.setProperty("tangosol.pof.config", "custom-pof-config.xml");
        System.setProperty("tangosol.coherence.cacheconfig", "custom-cache-config.xml");
        System.setProperty("tangosol.coherence.distributed.localstorage", "false");
        System.setProperty("coherence.tcmp.enabled", "false");
    }

    public static void main(String[] args) {
        final Coherence coherence;
        try {
            coherence = Coherence.clusterMember().start().get();
        } catch (InterruptedException | ExecutionException e) {
            throw new RuntimeException(e);
        }
        final Session session = coherence.getSession();
        final NamedMap<String, String> testCache =
                session.getMap("near-test");
        // After inserting a value by running MicroCoherenceTest I want to be
        // able to retrieve it with this program!
        System.out.println(testCache.get("hello"));
    }
}

results in the following exception:

2024-05-27 14:37:27.090/0.627 Oracle Coherence CE 24.03 (thread=Coherence, member=n/a): Error while starting cluster: java.lang.UnsupportedOperationException: TCMP clustering has been disabled; this configuration may only access clustered services via Extend proxies.
    at com.tangosol.coherence.component.net.Cluster.onStart(Cluster.java:2864)
    at com.tangosol.coherence.component.net.Cluster.start(Cluster.java:3425)
    at com.tangosol.coherence.component.util.SafeCluster.startCluster(SafeCluster.java:1611)
    at com.tangosol.coherence.component.util.SafeCluster.restartCluster(SafeCluster.java:1260)
    at com.tangosol.coherence.component.util.SafeCluster.ensureRunningCluster(SafeCluster.java:619)
    at com.tangosol.coherence.component.util.SafeCluster.getRunningCluster(SafeCluster.java:962)
    at com.tangosol.coherence.component.util.SafeCluster.start(SafeCluster.java:1596)
    at com.tangosol.net.CacheFactory.ensureCluster(CacheFactory.java:592)
    at com.tangosol.net.ExtensibleConfigurableCacheFactory.ensureService(ExtensibleConfigurableCacheFactory.java:771)
    at com.tangosol.net.ExtensibleConfigurableCacheFactory.startServices(ExtensibleConfigurableCacheFactory.java:873)
    at com.tangosol.net.ExtensibleConfigurableCacheFactory.activate(ExtensibleConfigurableCacheFactory.java:590)
    at com.tangosol.net.Coherence.startSystemCCF(Coherence.java:1811)
    at com.tangosol.net.Coherence.createSystemSession(Coherence.java:1932)
    at com.tangosol.net.Coherence.initializeSystemSession(Coherence.java:1892)
    at com.tangosol.net.Coherence.startInternal(Coherence.java:1562)
    at com.tangosol.net.Coherence.lambda$start$6(Coherence.java:1228)
    at java.base/java.lang.Thread.run(Thread.java:1583)

and later, when I try to retrieve the session:

Exception in thread "main" java.lang.RuntimeException: java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: TCMP clustering has been disabled; this configuration may only access clustered services via Extend proxies.
    at com.test.MicroCoherenceRPCTest.main(MicroCoherenceRPCTest.java:27)
Caused by: java.util.concurrent.ExecutionException: java.lang.UnsupportedOperationException: TCMP clustering has been disabled; this configuration may only access clustered services via Extend proxies.
    at java.base/java.util.concurrent.CompletableFuture.reportGet(CompletableFuture.java:396)
    at java.base/java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2073)
    at com.test.MicroCoherenceRPCTest.main(MicroCoherenceRPCTest.java:25)
Caused by: java.lang.UnsupportedOperationException: TCMP clustering has been disabled; this configuration may only access clustered services via Extend proxies.
    at com.tangosol.coherence.component.net.Cluster.onStart(Cluster.java:2864)
    at com.tangosol.coherence.component.net.Cluster.start(Cluster.java:3425)
    at com.tangosol.coherence.component.util.SafeCluster.startCluster(SafeCluster.java:1611)
    at com.tangosol.coherence.component.util.SafeCluster.restartCluster(SafeCluster.java:1260)
    at com.tangosol.coherence.component.util.SafeCluster.ensureRunningCluster(SafeCluster.java:619)
    at com.tangosol.coherence.component.util.SafeCluster.getRunningCluster(SafeCluster.java:962)
    at com.tangosol.coherence.component.util.SafeCluster.start(SafeCluster.java:1596)
    at com.tangosol.net.CacheFactory.ensureCluster(CacheFactory.java:592)
    at com.tangosol.net.ExtensibleConfigurableCacheFactory.ensureService(ExtensibleConfigurableCacheFactory.java:771)
    at com.tangosol.net.ExtensibleConfigurableCacheFactory.startServices(ExtensibleConfigurableCacheFactory.java:873)
    at com.tangosol.net.ExtensibleConfigurableCacheFactory.activate(ExtensibleConfigurableCacheFactory.java:590)
    at com.tangosol.net.Coherence.startSystemCCF(Coherence.java:1811)
    at com.tangosol.net.Coherence.createSystemSession(Coherence.java:1932)
    at com.tangosol.net.Coherence.initializeSystemSession(Coherence.java:1892)
    at com.tangosol.net.Coherence.startInternal(Coherence.java:1562)
    at com.tangosol.net.Coherence.lambda$start$6(Coherence.java:1228)
    at java.base/java.lang.Thread.run(Thread.java:1583)

If I remove the line disabling TCMP the program runs, but I suspect it then joins the cluster normally rather than using the gRPC client/proxy?!
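For readers puzzling over the nested "Caused by" chain in the second trace: start() returns a future, and calling get() on a future that failed wraps the original failure in an ExecutionException, which the test program then wraps again in a RuntimeException. The following is a minimal sketch using only the JDK (no Coherence classes); the class and method names are made up for illustration, and the failed future merely stands in for a start-up that went wrong.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;

public class FutureCauseDemo {

    // Hypothetical helper: waits for the future and returns the underlying
    // failure, or null if the future completed normally.
    static Throwable rootCause(CompletableFuture<?> future) {
        try {
            future.get();
            return null;
        } catch (ExecutionException e) {
            // get() wraps the original failure; the real reason is the cause.
            return e.getCause();
        } catch (InterruptedException e) {
            // Restore the interrupt flag and report the interruption itself.
            Thread.currentThread().interrupt();
            return e;
        }
    }

    public static void main(String[] args) {
        // Stand-in for a start-up future that failed.
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(
                new UnsupportedOperationException("TCMP clustering has been disabled"));

        Throwable cause = rootCause(failed);
        System.out.println(cause.getClass().getSimpleName() + ": " + cause.getMessage());
    }
}
```

Reading the innermost "Caused by" entry first is usually the quickest way to find the actual problem in traces like the one above.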

thegridman commented 3 months ago

This is because your code is doing this:

coherence = Coherence.clusterMember().start().get();

which as the method name suggests is going to start Coherence as a cluster member, but you have this property set

System.setProperty("coherence.tcmp.enabled", "false");

which will disable clustering.

If you want this code to start a client then you need to do Coherence.client().start()
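The mismatch described here (asking for a cluster member while clustering is switched off) could be caught before start-up with a small guard. This is only a sketch using the JDK: StartModeGuard, Mode and modeFor are hypothetical names, the Coherence calls are mentioned only in comments, and treating an unset property as "clustering enabled" is an assumption.

```java
public class StartModeGuard {

    // Hypothetical enum naming the two start-up styles discussed in this thread.
    enum Mode { CLIENT, CLUSTER_MEMBER }

    // Decides the start-up style from the value of "coherence.tcmp.enabled".
    // Assumption: an unset property means clustering is enabled.
    static Mode modeFor(String tcmpEnabled) {
        boolean clustered = tcmpEnabled == null || Boolean.parseBoolean(tcmpEnabled);
        return clustered ? Mode.CLUSTER_MEMBER : Mode.CLIENT;
    }

    public static void main(String[] args) {
        System.setProperty("coherence.tcmp.enabled", "false");
        Mode mode = modeFor(System.getProperty("coherence.tcmp.enabled"));

        // CLIENT         -> Coherence.client().start()
        // CLUSTER_MEMBER -> Coherence.clusterMember().start()
        System.out.println(mode); // prints CLIENT
    }
}
```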

javafanboy commented 3 months ago

Sorry, but I am now slightly confused - if I do NOT disable TCMP the client joins the cluster (I can see log messages that I interpret as it joining the cache service etc).

So how do I initiate the client so that it will NOT join the cluster but can still access caches using the gRPC client/proxy?

Do I then need to use another method to find the gRPC proxy (like a fixed IP)?

thegridman commented 3 months ago

If you are starting a client with Coherence.client().start() then you should be able to use System.setProperty("coherence.tcmp.enabled", "false"); But if your client cache config is the one you posted above, then it has services defined with <autostart>true</autostart> that are clustered services (i.e. distributed-scheme and paged-topic-scheme), and these will attempt to start the cluster. You either need a client config that only has the client remote-grpc-cache-scheme, or you must not autostart the clustered services.
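One way to check a client cache config for this problem is to scan it for clustered schemes that are set to autostart. The sketch below uses only the JDK's XML parser; ClientConfigCheck and its method are hypothetical names, only the two scheme types named above are checked, and a literal "true" is the only value detected (an autostart value given through a system-property macro would be missed).

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class ClientConfigCheck {

    // Clustered scheme elements named in this thread (not an exhaustive list).
    static final String[] CLUSTERED = {"distributed-scheme", "paged-topic-scheme"};

    // Returns the element name of every clustered scheme that contains
    // <autostart>true</autostart>.
    static List<String> autostartedClusteredSchemes(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse cache config", e);
        }
        List<String> found = new ArrayList<>();
        for (String tag : CLUSTERED) {
            NodeList schemes = doc.getElementsByTagName(tag);
            for (int i = 0; i < schemes.getLength(); i++) {
                Element scheme = (Element) schemes.item(i);
                NodeList auto = scheme.getElementsByTagName("autostart");
                if (auto.getLength() > 0
                        && "true".equals(auto.item(0).getTextContent().trim())) {
                    found.add(tag);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        String xml = "<cache-config><caching-schemes>"
                + "<distributed-scheme><scheme-name>server</scheme-name>"
                + "<autostart>true</autostart></distributed-scheme>"
                + "<remote-grpc-cache-scheme><scheme-name>remote-grpc</scheme-name>"
                + "</remote-grpc-cache-scheme>"
                + "</caching-schemes></cache-config>";
        System.out.println(autostartedClusteredSchemes(xml)); // prints [distributed-scheme]
    }
}
```

An empty result suggests the config is safe to use with a client started with clustering disabled, at least with respect to the two scheme types checked.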

javafanboy commented 3 months ago

Thanks, I will create a separate config for the "loosely connected" clients!


javafanboy commented 3 months ago

In my latest attempt I simplified the program to this (I assume I still need the WKA for finding the gRPC proxy etc):

package com.test;
import com.tangosol.net.Coherence;
import com.tangosol.net.NamedMap;
import com.tangosol.net.Session;
import java.util.concurrent.ExecutionException;
public class MicroCoherenceRPCTest {
    static {
        System.setProperty("tangosol.coherence.wka", "127.0.0.1");
        System.setProperty("tangosol.pof.enabled", "true");
        System.setProperty("tangosol.pof.config", "custom-pof-config.xml");
        System.setProperty("tangosol.coherence.cacheconfig", "grpc-custom-cache-config.xml");
        System.setProperty("coherence.tcmp.enabled", "false");
    }
    public static void main(String[] args) throws ExecutionException, InterruptedException {
        final Coherence coherence = Coherence.clusterMember().start().get();
        final Session session = coherence.getSession();
        NamedMap<String, String> testCache = session.getMap("near-scania");
        System.out.println(testCache.get("hello"));
    }
}

and created a separate grpc-custom-cache-config.xml that looks like this:

<?xml version="1.0"?>
<cache-config xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
              xmlns="http://xmlns.oracle.com/coherence/coherence-cache-config"
              xsi:schemaLocation="http://xmlns.oracle.com/coherence/coherence-cache-config coherence-cache-config.xsd"
              xml-override="{coherence.cacheconfig.override}">
  <defaults>
    <scope-name>${coherence.scope}</scope-name>
    <serializer>${coherence.serializer}</serializer>
  </defaults>

  <caching-scheme-mapping>
    <cache-mapping>
      <cache-name>*</cache-name>
      <scheme-name>remote-grpc</scheme-name>
    </cache-mapping>
  </caching-scheme-mapping>

  <caching-schemes>
    <remote-grpc-cache-scheme>
      <scheme-name>remote-grpc</scheme-name>
      <service-name>RemoteGrpcCache</service-name>
    </remote-grpc-cache-scheme>
  </caching-schemes>
</cache-config>

but still get the exception

2024-05-27 19:47:40.076/0.623 Oracle Coherence CE 24.03 <Error> (thread=Coherence, member=n/a): java.lang.UnsupportedOperationException: TCMP clustering has been disabled; this configuration may only access clustered services via Extend proxies.
    at com.tangosol.coherence.component.net.Cluster.onStart(Cluster.java:2864)
    at com.tangosol.coherence.component.net.Cluster.start(Cluster.java:3425)
    at com.tangosol.coherence.component.util.SafeCluster.startCluster(SafeCluster.java:1611)
    at com.tangosol.coherence.component.util.SafeCluster.restartCluster(SafeCluster.java:1260)
    at com.tangosol.coherence.component.util.SafeCluster.ensureRunningCluster(SafeCluster.java:619)
    at com.tangosol.coherence.component.util.SafeCluster.getRunningCluster(SafeCluster.java:962)
    at com.tangosol.coherence.component.util.SafeCluster.start(SafeCluster.java:1596)
    at com.tangosol.net.CacheFactory.ensureCluster(CacheFactory.java:592)
    at com.tangosol.net.ExtensibleConfigurableCacheFactory.ensureService(ExtensibleConfigurableCacheFactory.java:771)
    at com.tangosol.net.ExtensibleConfigurableCacheFactory.startServices(ExtensibleConfigurableCacheFactory.java:873)
    at com.tangosol.net.ExtensibleConfigurableCacheFactory.activate(ExtensibleConfigurableCacheFactory.java:590)
    at com.tangosol.net.Coherence.startSystemCCF(Coherence.java:1811)
    at com.tangosol.net.Coherence.createSystemSession(Coherence.java:1932)
    at com.tangosol.net.Coherence.initializeSystemSession(Coherence.java:1892)
    at com.tangosol.net.Coherence.startInternal(Coherence.java:1562)
    at com.tangosol.net.Coherence.lambda$start$6(Coherence.java:1228)
    at java.base/java.lang.Thread.run(Thread.java:1583)

Are these things documented somewhere? I tried reading about gRPC but did not find very detailed information about starting the client etc.

thegridman commented 3 months ago

In your code you are still starting Coherence as a cluster member:

final Coherence coherence = Coherence.clusterMember().start().get();

This means Coherence will start various "system" services which require the cluster. You need to start Coherence as a client:

final Coherence coherence = Coherence.client().start().get();

javafanboy commented 3 months ago

Thanks very much - now it works!!!!