Closed brianoliver closed 9 years ago
@brianoliver said: The name of an event channel or distributor is not hard coded, but set through configuration.
For example: In the Push Replication Tests we use the following:
<event:distributor-external-name>{site-name}-{cluster-name}-{cache-name}</event:distributor-external-name>
Notice here that we're using the "site-name" system parameter. If this is not set it will be resolved to
Consequently, this issue is caused by the "site-name" parameter being used in a configuration file without being set as a system parameter. If the system parameter is not desired, the "site-name" parameter can be removed from the configuration.
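One way to catch this kind of mistake before starting a node is to check a name template for placeholders that have no value. The following is only a minimal sketch, not part of Push Replication or Coherence: the class PlaceholderCheck and its method unresolved are hypothetical names, and the caller is assumed to supply the parameter values (for example, collected from system properties).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration; it is not a Coherence or Incubator API.
class PlaceholderCheck {
    // Matches {name} style placeholders such as {site-name}.
    private static final Pattern PLACEHOLDER = Pattern.compile("\\{([^{}]+)\\}");

    // Returns the placeholder names in the template that have no non-empty
    // value in params, in the order they appear in the template.
    static List<String> unresolved(String template, Map<String, String> params) {
        List<String> missing = new ArrayList<>();
        Matcher matcher = PLACEHOLDER.matcher(template);
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = params.get(name);
            if (value == null || value.isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

Calling PlaceholderCheck.unresolved with the template {site-name}-{cluster-name}-{cache-name} and values for only cluster-name and cache-name returns a list containing site-name, which flags the configuration described in this issue.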
@brianoliver said: This is due to a configuration mistake.
This issue was imported from JIRA COHINC-148
Reported by agirona
Marked as works as designed by @brianoliver on Tuesday, October 13th 2015, 11:16:54 am
The architecture is Active-Active Push Replication 11.3.0 with Coherence 3.7.1.13. This environment differs from the usual Push Replication architecture because each of the two clusters extends across two different sites. The sites are physically close, so network latency is very low and we can afford to have Cluster A extend across site1 and site2 and Cluster B also extend across site1 and site2. In the test performed, we are replicating 5 different caches between Cluster A and Cluster B.
The setup works fine when an event channel is created for each cache / site pair, so that there are event channels from Cluster A - site1 to Cluster B and event channels from Cluster A - site2 to Cluster B. After several restarts (stopping and starting cache server nodes on site1 and site2), we end up with event channels only from Cluster A - site1 to Cluster B and none from Cluster A - site2 to Cluster B. As a result, only data whose primary is on a Cluster A - site1 node is replicated to Cluster B, because nothing is listening to events in Cluster A - site2 (our statusHA is SITE-SAFE, so the primary data is always distributed across sites).
We have tested unsetting tangosol.coherence.site and setting tangosol.coherence.rack and this way it works fine, i.e. the channels are always created correctly so all the data is replicated and no data is lost.
When checking the channel name with JConsole, it looks like the trick is that when tangosol.coherence.site is not set, the channel name is defined as mycluster-mycache instead of mysite-mycluster-mycache.