WrapperBinaryCacheStoreTests fails to stop

GoogleCodeExporter commented 9 years ago

What steps will reproduce the problem?
-------------------------------------

1. Set the property coherence.incubator.cluster.0.count=4 in 
common-cluster.properties
2. mvn -Dtest=WrapperBinaryCacheStoreTest test

What is the expected output? What do you see instead?
-----------------------------------------------

For values of coherence.incubator.cluster.0.count less than 4, e.g. 3, the test 
succeeds. With 4 or more nodes, the clusters fail to shut down properly, 
entering an infinite loop:

2011-11-07 11:20:13.907/13.282 Oracle Coherence GE 3.7.0.2 <D5> (thread=main, 
member=n/a): Waiting for cluster to stop SafeCluster: Name=gridman-test

Group{Address=224.3.7.0, Port=8016, TTL=0}

MasterMemberSet
  (
  ThisMember=Member(Id=6, Timestamp=2011-11-07 11:20:10.916, Address=192.168.70.24:8088, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
  OldestMember=Member(Id=2, Timestamp=2011-11-07 11:20:04.316, Address=192.168.70.24:8090, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
  ActualMemberSet=MemberSet(Size=5, BitSetCount=2
    Member(Id=2, Timestamp=2011-11-07 11:20:04.316, Address=192.168.70.24:8090, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
    Member(Id=3, Timestamp=2011-11-07 11:20:04.333, Address=192.168.70.24:8092, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
    Member(Id=4, Timestamp=2011-11-07 11:20:05.709, Address=192.168.70.24:8094, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
    Member(Id=5, Timestamp=2011-11-07 11:20:07.401, Address=192.168.70.24:8096, MachineId=11800, Location=process:31388, Role=GridManExtendProxyNode)
    Member(Id=6, Timestamp=2011-11-07 11:20:10.916, Address=192.168.70.24:8088, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
    )
  RecycleMillis=1200000
  RecycleSet=MemberSet(Size=1, BitSetCount=2
    Member(Id=7, Timestamp=2011-11-07 11:20:12.184, Address=192.168.70.24:8098, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
    )
  )

TcpRing{Connections=[4, 5]}
IpMonitor{AddressListSize=0}

2011-11-07 11:20:14.008/13.383 Oracle Coherence GE 3.7.0.2 <D5> (thread=main, 
member=n/a): Waiting for cluster to stop SafeCluster: Name=gridman-test

Group{Address=224.3.7.0, Port=8016, TTL=0}

MasterMemberSet
  (
  ThisMember=Member(Id=6, Timestamp=2011-11-07 11:20:10.916, Address=192.168.70.24:8088, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
  OldestMember=Member(Id=2, Timestamp=2011-11-07 11:20:04.316, Address=192.168.70.24:8090, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
  ActualMemberSet=MemberSet(Size=5, BitSetCount=2
    Member(Id=2, Timestamp=2011-11-07 11:20:04.316, Address=192.168.70.24:8090, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
    Member(Id=3, Timestamp=2011-11-07 11:20:04.333, Address=192.168.70.24:8092, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
    Member(Id=4, Timestamp=2011-11-07 11:20:05.709, Address=192.168.70.24:8094, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
    Member(Id=5, Timestamp=2011-11-07 11:20:07.401, Address=192.168.70.24:8096, MachineId=11800, Location=process:31388, Role=GridManExtendProxyNode)
    Member(Id=6, Timestamp=2011-11-07 11:20:10.916, Address=192.168.70.24:8088, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
    )
  RecycleMillis=1200000
  RecycleSet=MemberSet(Size=1, BitSetCount=2
    Member(Id=7, Timestamp=2011-11-07 11:20:12.184, Address=192.168.70.24:8098, MachineId=11800, Location=process:31388, Role=GridManStorageNode)
    )
  )

... and so forth ad infinitum.

What version of the product are you using? On what operating system?
-------------------------------------------------------------

I am using Gridman 1.1.0, along with Coherence 3.7.0-patch2, on OS X Snow 
Leopard.

Thank you for building such a fantastic test library.
  Rubén

Original issue reported on code.google.com by ruben...@gmail.com on 7 Nov 2011 at 10:25

GoogleCodeExporter commented 9 years ago

Also, the following exception crops up during the execution of that test:

2011-11-07 11:20:07.464/6.839 Oracle Coherence GE 3.7.0.2 <Error> 
(thread=Invocation:Management, member=2): 
(Wrapped) javax.management.InstanceAlreadyExistsException: 
Coherence:type=Service,name=TestDistributedService,nodeId=4
    at com.tangosol.util.Base.ensureRuntimeException(Base.java:288)
    at com.tangosol.util.Base.ensureRuntimeException(Base.java:269)
    at com.tangosol.coherence.component.net.management.gateway.Local.registerModelMBean(Local.CDB:49)
    at com.tangosol.coherence.component.net.management.Connector.onRegister(Connector.CDB:43)
    at com.tangosol.coherence.component.net.management.Connector$Register.run(Connector.CDB:2)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.InvocationService.onInvocationMessage(InvocationService.CDB:6)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.grid.InvocationService$InvocationMessage.onReceived(InvocationService.CDB:32)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onMessage(Grid.CDB:33)
    at com.tangosol.coherence.component.util.daemon.queueProcessor.service.Grid.onNotify(Grid.CDB:33)
    at com.tangosol.coherence.component.util.Daemon.run(Daemon.CDB:42)
    at java.lang.Thread.run(Thread.java:637)
Caused by: javax.management.InstanceAlreadyExistsException: 
Coherence:type=Service,name=TestDistributedService,nodeId=4
    at com.sun.jmx.mbeanserver.Repository.addMBean(Repository.java:453)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.internal_addObject(DefaultMBeanServerInterceptor.java:1484)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerDynamicMBean(DefaultMBeanServerInterceptor.java:963)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerObject(DefaultMBeanServerInterceptor.java:917)
    at com.sun.jmx.interceptor.DefaultMBeanServerInterceptor.registerMBean(DefaultMBeanServerInterceptor.java:312)
    at com.sun.jmx.mbeanserver.JmxMBeanServer.registerMBean(JmxMBeanServer.java:482)
    at com.tangosol.coherence.component.net.management.gateway.Local.registerModelMBean(Local.CDB:23)
    ... 8 more

Original comment by ruben...@gmail.com on 7 Nov 2011 at 2:15

GoogleCodeExporter commented 9 years ago

I have replaced the original shutdown method with this one (a hard stop), and 
it now seems to be working:

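            // Load both classes through the isolated class loader.
            Class<?> cls = classLoader.loadClass("com.tangosol.net.CacheFactory");
            Class<?> cls2 = classLoader.loadClass("com.tangosol.coherence.component.util.SafeCluster");

            // Look up the static cluster getter and the method used to stop the cluster.
            Method m1 = cls.getDeclaredMethod("getCluster");
            Method m2 = cls2.getDeclaredMethod("shutdown");

            // getCluster() is static, so it is invoked with a null target.
            Object invoke = m1.invoke(null);
            CacheFactory.log("CLASS: "+invoke.getClass().toString(), CacheFactory.LOG_INFO);

            // Stop the cluster instance that getCluster() returned.
            m2.invoke(invoke);
            CacheFactory.log("getCluster().stop() called", CacheFactory.LOG_INFO);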

Would it be worth adding a new "stop()" method to CoherenceClassloaderLifecycle?

Original comment by mikel.al...@gmail.com on 8 Nov 2011 at 5:09

GoogleCodeExporter commented 9 years ago

Amendment:
Method m2 = cls2.getDeclaredMethod("shutdown");
should be
Method m2 = cls2.getDeclaredMethod("stop");
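
For reference, here is a minimal, self-contained sketch of the same reflective pattern (load a factory class through a class loader, call its static getter, then call stop() on whatever it returns). FakeFactory and FakeCluster are hypothetical stand-ins so that the example runs without Coherence on the classpath; they are not Gridman or Coherence classes. The sketch looks up stop() on the runtime class of the returned object with getMethod rather than on a hard-coded implementation class, which is only an assumption about what would be more robust here, not a tested fix for this issue.

```java
import java.lang.reflect.Method;

final class ReflectiveStop {

    /** Hypothetical stand-in for the cluster object; NOT a Coherence class. */
    public static class FakeCluster {
        private volatile boolean running = true;

        public void stop() {
            running = false;
        }

        public boolean isRunning() {
            return running;
        }
    }

    /** Hypothetical stand-in for a factory exposing a static getCluster() getter. */
    public static class FakeFactory {
        private static final FakeCluster CLUSTER = new FakeCluster();

        public static FakeCluster getCluster() {
            return CLUSTER;
        }
    }

    /**
     * Loads the named factory class through the given loader, calls its static
     * getCluster() method, then calls the public stop() method on the result.
     * Returns the object that was stopped.
     */
    static Object stopViaReflection(ClassLoader loader, String factoryClassName)
            throws ReflectiveOperationException {
        Class<?> factory = loader.loadClass(factoryClassName);

        // Static method: the target object passed to invoke() is null.
        Method getCluster = factory.getMethod("getCluster");
        Object cluster = getCluster.invoke(null);

        // getMethod() also finds public methods inherited from superclasses
        // and interfaces; getDeclaredMethod() only sees methods declared
        // directly on the class it is called on.
        Method stop = cluster.getClass().getMethod("stop");
        stop.invoke(cluster);
        return cluster;
    }

    public static void main(String[] args) throws Exception {
        Object stopped = stopViaReflection(
                ReflectiveStop.class.getClassLoader(),
                FakeFactory.class.getName());
        System.out.println("running after stop: "
                + ((FakeCluster) stopped).isRunning());
    }
}
```

In the real test harness the factory name would be "com.tangosol.net.CacheFactory" and the loader would be the per-node class loader, as in the snippet above.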

Original comment by mikel.al...@gmail.com on 8 Nov 2011 at 5:41
