igniterealtime / openfire-hazelcast-plugin

Adds support for running multiple redundant Openfire servers together in a cluster

On cluster join, simulation erroneously simulates owner as being local #62

Open guusdk opened 2 years ago

guusdk commented 2 years ago

For a limited subset of data that is stored in (clustered) caches, ClusterListener keeps track of which cluster node added which data. It uses a Hazelcast EntryEventListener for this, which fires for every entry modification. The implementation listens for those events, grabs the NodeID of the originating node and the data of interest, and uses that to track what needs to be tracked.

When a node initially joins a cluster, it can be presented with a ClusteredCache that already contains data (added by remote nodes). In order to synchronize this preexisting data with the locally stored subset, EntryEvents are simulated for these existing entries. This allows the code that keeps track to simply process a bunch of "entry added" events, which should take care of things.

The related code is in this method: https://github.com/igniterealtime/openfire-hazelcast-plugin/blob/da9b8e6f46aea907f32e3a74d1a0f41e700da504/src/java/org/jivesoftware/openfire/plugin/util/cache/ClusterListener.java#L165-L185
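To make this concrete, here is a rough sketch of what such a replay boils down to. The class, method and parameter names are made up for illustration, and Hazelcast 3.x packages are assumed; this is not the plugin's actual code, just the pattern:

```java
// Sketch only, with placeholder names; Hazelcast 3.x packages are assumed.
import java.util.Map;

import com.hazelcast.core.EntryEvent;
import com.hazelcast.core.EntryEventType;
import com.hazelcast.core.Member;
import com.hazelcast.map.listener.EntryAddedListener;

public class SimulatedEntryEvents {

    /** Replays every pre-existing cache entry as a simulated 'entry added' event. */
    static void simulateEntryAddedEvents(final Map<String, Object> clusteredCache,
                                         final String cacheName,
                                         final Member localMember,
                                         final EntryAddedListener<String, Object> entryListener) {
        for (final Map.Entry<String, Object> entry : clusteredCache.entrySet()) {
            // The event is attributed to the *local* member, even though the entry
            // may well have been added by a remote node.
            final EntryEvent<String, Object> simulatedEvent = new EntryEvent<>(
                cacheName,                       // event source (the cache name)
                localMember,                     // the member presented as originator
                EntryEventType.ADDED.getType(),  // event type: ADDED
                entry.getKey(),
                entry.getValue());
            entryListener.entryAdded(simulatedEvent);
        }
    }
}
```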

My concern here is that on line 175, the local cluster node is used as the 'source' of the event. I believe that this leads to a problem, as it results in the local node being regarded as the node that added the data, instead of the node that originally added it. That matters, because the nodeId is later used to determine what data needs to be processed if a particular cluster node becomes unavailable.

(as an aside: this all assumes that only one node will ever modify the data, and all other nodes will only read it. That's probably true for a lot of caches, but it's far from guaranteed)

guusdk commented 2 years ago

Assuming that I'm right that this indeed is a problem (this is typically where @GregDThomas points out why I'm wrong), I'm seeing at least a couple of paths to fix this:

  1. Record the 'owner' of the data as part of the value, and iterate over that (a rough sketch follows after this list).
  2. Have each 'remote' cluster node send a cluster event to a cluster node that it sees joining the cluster, containing the data that it owns. Each cluster node already needs to maintain this data, as it is needed to 'restore the cache' whenever we flip from DefaultCache to ClusteredCache or vice versa.
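
A minimal sketch of what option 1 could look like. The wrapper type name `OwnedValue` is made up for illustration; the point is simply that a newly joined node could reconstruct ownership by iterating the cache and reading the NodeID stored inside each value, rather than attributing everything to itself:

```java
import java.io.Serializable;

import org.jivesoftware.openfire.cluster.NodeID;

/**
 * Hypothetical cache value wrapper that records which cluster node 'owns' the entry.
 * Storing the owner inside the value means ownership can be rebuilt by iterating
 * the cache, instead of being guessed from event metadata.
 */
public class OwnedValue<V extends Serializable> implements Serializable {

    private final NodeID owner;
    private final V value;

    public OwnedValue(final NodeID owner, final V value) {
        this.owner = owner;
        this.value = value;
    }

    public NodeID getOwner() {
        return owner;
    }

    public V getValue() {
        return value;
    }
}
```
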
GregDThomas commented 2 years ago

I think you may have misinterpreted what's happening.

Or I have, probably more likely. Not quite sure what is happening. At what point does the HashMap<>() get swapped for an IMap<>()?

By my reading, I think what is happening is:

a) This node has joined the cluster - joinCluster() is invoked
b) CacheListeners are added to the caches
c) CacheInsert events are raised for all existing entries in the Cache - which at this point are local only, so are owned by the local node
d) The fireJoinedCluster() event is raised - which I think swaps out the HashMap<> Caches for IMap<> caches
e) Only now do the caches hold entries for all cluster nodes

Maybe.

It's been a while!

guusdk commented 2 years ago

I'm grateful that you're at least trying, even if it has been a while. :) I very much appreciate your feedback.

> At what point does the HashMap<>() get swapped for an IMap<>()?

I'm not sure what you're referring to here.

Are you certain of what you say under item C? Specifically, I think that at this point, the caches are explicitly no longer local - they have already been switched to the ClusteredCache (and thus contain data that was added by other cluster nodes). The fact that on line 171 the cache is cast to ClusteredCache doesn't leave much room for anything else, I think.

The purpose of the entryListener is to record what is happening on all other cluster nodes, not what's happening on the local node. This is to ensure that when a cluster node unexpectedly leaves the cluster, we can use that recorded data to 'clean up'. From what I've seen so far, the data structures that are maintained through entryListeners do not even include data that's "owned" by the local node. That is why I was triggered when I saw the local node being used as the source of these events. These simulated events are only being processed locally.
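
Something like the following is what I mean (very much a sketch: `RemoteOwnershipTracker` and `nodeIdFor` are made-up names, not the plugin's classes; the real listener derives the NodeID from the Hazelcast member differently). The tracking structure only ever needs entries for *remote* nodes, so events originating from the local node can simply be ignored:

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

import com.hazelcast.core.EntryEvent;
import com.hazelcast.core.Member;
import org.jivesoftware.openfire.XMPPServer;
import org.jivesoftware.openfire.cluster.NodeID;

/**
 * Sketch: remember which *remote* node added which cache keys, so the data can be
 * cleaned up when that node drops out of the cluster. Events that originate on the
 * local node are deliberately not tracked.
 */
public class RemoteOwnershipTracker {

    private final Map<NodeID, Set<String>> keysByRemoteNode = new ConcurrentHashMap<>();

    public void entryAdded(final EntryEvent<String, ?> event) {
        final NodeID eventSource = nodeIdFor(event.getMember());
        if (XMPPServer.getInstance().getNodeID().equals(eventSource)) {
            return; // the local node's own data needs no remote clean-up bookkeeping
        }
        keysByRemoteNode
            .computeIfAbsent(eventSource, nodeId -> ConcurrentHashMap.newKeySet())
            .add(event.getKey());
    }

    /** Keys that were owned by a node that has just left the cluster. */
    public Set<String> keysOwnedBy(final NodeID departedNode) {
        return keysByRemoteNode.getOrDefault(departedNode, Collections.emptySet());
    }

    private NodeID nodeIdFor(final Member member) {
        // Placeholder: in reality the NodeID would be derived from the member's UUID.
        throw new UnsupportedOperationException("illustrative placeholder");
    }
}
```
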

GregDThomas commented 2 years ago

No, not at all sure. I trust your judgement!

re. solution number 1, my concern is that every cache entry must be of the same shape so that the real owner can be found. re. solution number 2, my concern is that you're essentially doubling the traffic across the network - once from Hazelcast to get the cache entries, once from each of the owners with the data they own.

Perhaps a variation on number 2: each node simply sends a single message (for each Cache) containing the keys in the cache that it owns. The new node can then fetch the values from the Cache, and simulate the entry for the correct owner.
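
A sketch of that variation, with made-up class and accessor names (nothing here exists in the plugin today): an existing node announces, per cache, only the keys it owns, and the joining node resolves the values from the clustered cache itself before simulating events attributed to the announced owner.

```java
import java.io.Serializable;
import java.util.Map;
import java.util.Set;

import org.jivesoftware.openfire.cluster.NodeID;

/**
 * Hypothetical message sent by an existing cluster node to a newly joined node.
 * It lists only the keys the sender owns per cache, keeping the payload small;
 * the joining node looks up the values locally and simulates 'entry added' events
 * with {@link #getOwner()} as the source.
 */
public class CacheOwnershipAnnouncement implements Serializable {

    private final NodeID owner;
    private final Map<String, Set<String>> ownedKeysByCacheName;

    public CacheOwnershipAnnouncement(final NodeID owner,
                                      final Map<String, Set<String>> ownedKeysByCacheName) {
        this.owner = owner;
        this.ownedKeysByCacheName = ownedKeysByCacheName;
    }

    public NodeID getOwner() {
        return owner;
    }

    public Map<String, Set<String>> getOwnedKeysByCacheName() {
        return ownedKeysByCacheName;
    }
}
```
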

guusdk commented 2 years ago

I think that we're largely in agreement.

As for your concern about solution number 1: that is only a problem if the entries are maintained in a singular data structure (for all caches), as is (somewhat) the case in the Hazelcast plugin. When we move this to Openfire (#50), then it is only natural to have a data structure 'per use-case' - where every use case can have its own shape.

As for solution number 2: somewhat, yes. The secondary broadcasts would be a lot lighter (only including ownership and minimal data), but it's not ideal. I'd use this only if we can't / won't record "node ownership" of a particular data type (which might not always be relevant to the datatype).

@evdherberg and I are working on #50 as we type. As soon as we push, you'll see updates on his branch, here: https://github.com/evdherberg/Openfire/tree/test-cluster-refactor - So far, we haven't needed option 2 yet, so that's good. This is very much a work in progress (and some refactoring can be expected), but I'm interested in any feedback that you already have on the approach we've taken. (the corresponding changes in the Hazelcast plugin: https://github.com/evdherberg/openfire-hazelcast-plugin/tree/improve-clustering)

GregDThomas commented 2 years ago

Love seeing all the code being removed from the plugin! OK, so it's being moved, but hopefully the new code will be better.

(glad to see the plugin is getting some love, at least, I've not touched it for what seems like forever)

GregDThomas commented 2 years ago

As a thought, is it worth doing a minor release of the HZ plug-in now, with no changes other than the max server version in the plugin.xml? (I forget the exact property name) The current version of the plugin won’t work on the next major version of Openfire, and it would be good to head people off at the pass before they install it.