eclipse-ee4j / glassfish

Eclipse GlassFish
https://eclipse-ee4j.github.io/glassfish/
383 stars 143 forks source link

classloader errors on instance restart or deployment after a cluster restart #17971

Closed glassfishrobot closed 12 years ago

glassfishrobot commented 12 years ago

3.1.2 nightly build b14-12_09_2011

The intermittent failures were observed in tx recovery tests of jdbc/jms app with dblog while verifying the fix for #17789. When a recovery test failed, db lock was not released and caused the failure of later tests. Since the intermittent failures apear often and affected the test execution. File a p2 bug to track issue. I have sent test details in emails to dev and will add related part to this bug.

Environment

RHL 5 and JDK 1.6.0_24

Affected Versions

[3.1.2_dev]

glassfishrobot commented 6 years ago
glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: auto-delegated recovery of jdbc/jms app with dblog and test source in appserver-sqe/pe/transaction/recovery/cliapp3

This 1 test is compared w.r.t. different builds from my observation. 1) promoted build 10, the test passed consistently. 2) nightly build 11 with the fix for self recovery, e.g. b11-11_21_2011, the test failed consistently. filed #17789 and verified multiple fixes on later build. 3) nightly build 14 with the fix for #17789, e.g. b14-12_09_2011 the test passed in some runs, and failed in other runs. filed this bug, #17971

The examples of passed and failed cases on 14-12_09_2011 are given below. Run A. passed case. http://sqe-hudson.us.oracle.com:8080/hudson/view/Sherry_core/job/sherry-tx-lc-dbg/ Good news is that all tx recovery tests with db log in this run.

Run B. failed case I stopped the hudson job manually after I saw the failure so that log can be viewed easily. Run B and Run A are in the same order of execution as below --cliweb pass --cliapp3 fail --rest tests

http://sqe-hudson.us.oracle.com:8080/hudson/job/sherry-tx/ws/glassfish3/glassfish/nodes/localhost-domain1/clustered_instance_2/logs/server.log/view/

[#|2011-12-09T19:31:36.969-0800|INFO|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.inbound|_ThreadID=17;_ThreadName=Thread-2;| Recovery of Inbound Transactions started.|#]

[#|2011-12-09T19:31:37.273-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service|_ThreadID=17;_ThreadName=Thread-2;| RAR8061: failed to load resource-adapter-config or RA [ ds_jdbc_ra ], com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: No Classloader found for RA [ ds_jdbc_ra ]|#]

[#|2011-12-09T19:31:37.274-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service|_ThreadID=17;_ThreadName=Thread-2;| RAR8060: Unable to lookup pool [ poolNontx ], javax.naming.NamingException: Lookup failed for '__SYSTEM/pools/poolNontx' in SerialContext[myEnv=

{java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl, java.naming.factory.url.pkgs=com.sun.enterprise.naming} [Root exception is javax.naming.NameNotFoundException: pools]|#]

[#|2011-12-09T19:31:37.274-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service|_ThreadID=17;_ThreadName=Thread-2;|RAR6017 : Failed to get connection pool object jdbc/nontx via JNDI lookup : com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: Lookup failed for '__SYSTEM/pools/poolNontx' in SerialContext[myEnv={java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl, java.naming.factory.url.pkgs=com.sun.enterprise.naming}

|#]

[#|2011-12-09T19:31:37.275-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.resource.recovery|_ThreadID=17;_ThreadName=Thread-2;| RAR7109: Error while loading jdbc resources during recovery : jdbc/nontx|#]

[#|2011-12-09T19:31:37.279-0800|INFO|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.CosTransactions|_ThreadID=17;_ThreadName=Thread-2;| JTS5014: Recoverable JTS instance, serverId = [100]|#]

[#|2011-12-09T19:31:37.317-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service|_ThreadID=18;_ThreadName=Thread-2;| RAR8061: failed to load resource-adapter-config or RA [ ds_jdbc_ra ], com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: No Classloader found for RA [ ds_jdbc_ra ]|#]

[#|2011-12-09T19:31:37.318-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service|_ThreadID=18;_ThreadName=Thread-2;|RAR8060: Unable to lookup pool [ poolNontx ], javax.naming.NamingException: Lookup failed for '__SYSTEM/pools/poolNontx' in SerialContext[myEnv=

{java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl, java.naming.factory.url.pkgs=com.sun.enterprise.naming} [Root exception is javax.naming.NameNotFoundException: pools]|#]

[#|2011-12-09T19:31:37.318-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service|_ThreadID=18;_ThreadName=Thread-2;|RAR6017 : Failed to get connection pool object jdbc/nontx via JNDI lookup : com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: Lookup failed for '__SYSTEM/pools/poolNontx' in SerialContext[myEnv={java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl, java.naming.factory.url.pkgs=com.sun.enterprise.naming}

|#]

[#|2011-12-09T19:31:37.318-0800|SEVERE|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.CosTransactions|_ThreadID=18;_ThreadName=Thread-2;| JTS5070: Exception occurred while obtaining JDBC resource [jdbc/nontx] for transaction log.|#]

[#|2011-12-09T19:31:37.319-0800|SEVERE|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.CosTransactions|_ThreadID=18;_ThreadName=Thread-2;| javax.naming.NamingException: Lookup failed for 'jdbc/nontx' in SerialContext[myEnv=

{java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl, java.naming.factory.url.pkgs=com.sun.enterprise.naming} [Root exception is javax.naming.NamingException: Unable to lookup resource : jdbc/nontx [Root exception is com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: Lookup failed for '__SYSTEM/pools/poolNontx' in SerialContext[myEnv={java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl, java.naming.factory.url.pkgs=com.sun.enterprise.naming}

]] at com.sun.enterprise.naming.impl.SerialContext.lookup(SerialContext.java:518) at com.sun.enterprise.naming.impl.SerialContext.lookup(SerialContext.java:455) at javax.naming.InitialContext.lookup(InitialContext.java:392) at javax.naming.InitialContext.lookup(InitialContext.java:392) at com.sun.jts.CosTransactions.LogDBHelper.(LogDBHelper.java:101) at com.sun.jts.CosTransactions.LogDBHelper.(LogDBHelper.java:88) at com.sun.jts.CosTransactions.RecoveryManager.dbXARecovery(RecoveryManager.java:1180) at com.sun.jts.CosTransactions.ResyncThread.run(RecoveryManager.java:1897)

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: Another examples captured from previous email show that the intermittent failure also on self-recovery of jdbc/jms app in earlier tests.

Run A: http://sqe-hudson.us.oracle.com:8080/hudson/job/sherry-tx/22/artifact/appserver-sqe/reports/pe-ee/i386_apg-sqe2_Linux/html/test_results_txn.html

22 b13-12_04_2011 10 tests in the following order

1) delegated and self recovery with jdbc app. (4 tests), pass 2) self recovery with jdbc/jms app (4 tests), fail 3) delegated recovery with jdbc/jms app (2 tests), fail due to #17789. I put failure tests at last since failures made db in bad condition.

Run B: new failure in self recovery and proved to be intermittent from other runs. http://sqe-hudson.us.oracle.com:8080/hudson/job/sherry-tx/23/artifact/appserver-sqe/reports/pe-ee/i386_apg-sqe1_Linux/html/test_results_txn.html

23 re-continuous build 403 with the fix for #17789.

the same 10 tests and execution order and source as Run A. 1) delegated and self recovery with jdbc app. (4 tests), pass 2) self recovery with jdbc/jms app (4 tests), see new failure in comparing with b13-12_04_2011. 3) delegated recovery with jdbc/jms app (2 tests), possibly due to unreleased db lock in 2).

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: Track the intermittent failure of another run here even though the cause may be different.

On b14 promoted build, intermittent failure is observed in self recovery of jdbc/jms app (cliapp2) with dblog.

verify-data: [echo] Executing test on port 38080 [java] WS HOME appserver-sqe [java] main: testSuiteID=tx-cliapp2- [java] main: testCase=verify_xa [java] invoking webclient at http://localhost:38080/cliapp2/verify?xa [java] Processing line: RESULTS:0 [java] FAILURE [java] did not get expectedResult=RESULTS:9 [java] Generating report at /export/hudson/workspace/sherry-lc-oracle/appserver-sqe/test_results.xml [java] [java] [java] ----------------------------------------- [java] - tx-cliapp2-t3mql-dblog-killASInstanceAutomaticSelfRecovery: FAIL - [java] ----------------------------------------- [java] Total PASS: 0 [java] Total FAIL: 1 [java] Total DNR: 0 [java] -----------------------------------------

I did not find Classloader error in this run. The test results and server.log are saved on RunID 20. http://agni-1.us.oracle.com/JSPWiki/Wiki.jsp?page=V312CoreTest The hudson job for this run is at http://sqe-hudson.us.oracle.com:8080/hudson/view/Sherry_core/job/sherry-lc-oracle/7/

glassfishrobot commented 12 years ago

@glassfishrobot Commented mvatkina said: The test failed because instance2 was down on deploy and the insert didn't happen (see deploy output from the console log below). How can I match the timing of this test to other tests in instance2 log?

deploy-common-ee: [echo] So you are using Enterprise Version eh ? [exec] asadmin --host localhost --port 4848 --user admin --passwordfile /export/hudson/workspace/sherry-lc-oracle/appserver-sqe/build-config/adminpassword.txt --interactive=false --echo=true --terse=false deploy --force=false --precompilejsp=false --verify=false --retrieve /export/hudson/workspace/sherry-lc-oracle/appserver-sqe/build/ee/i386_apg-sqe2_Linux/cliapp2/archive --generatermistubs=false --availabilityenabled=false --asyncreplication=true --target sqe-cluster --keepreposdir=false --keepfailedstubs=false --isredeploy=false --logreportederrors=true /export/hudson/workspace/sherry-lc-oracle/appserver-sqe/build/ee/i386_apg-sqe2_Linux/cliapp2/archive/cliapp2App.ear [exec] Application deployed with name cliapp2App. [exec] WARNING: Command _deploy did not complete successfully on server instance clustered_instance_2: remote failure: Failed to load the application on instance clustered_instance_2. The application will not run properly. Please fix your application and redeploy. [exec] Exception while loading the app : EJB Container initialization error. Please see server.log for more details.

glassfishrobot commented 12 years ago

@glassfishrobot Commented mvatkina said: There is something wrong with the classloaders in 3.1.2 after many restarts and redeploys.

Sherry's hudson job fails because instance can't be started (intermittently):

[#|2011-12-09T19:31:37.273-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service|_ThreadID=17;_ThreadName=Thread-2;|RAR8061: failed to load resource-adapter-config or RA [ ds_jdbc_ra ], com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: No Classloader found for RA [ ds_jdbc_ra ]|#]

I see (deterministically) in transaction recovery devtests ClassNotFoundException: com.sun.corba.ee.impl.oa.rfm.ReferenceManagerConfigurator after several cluster restarts and app redeploys:

Caused by: java.lang.ClassNotFoundException: com.sun.corba.ee.impl.oa.rfm.ReferenceManagerConfigurator at org.glassfish.web.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1509) at org.glassfish.web.loader.WebappClassLoader.loadClass(WebappClassLoader.java:1359) at com.sun.corba.ee.spi.orb.ORB$2.evaluate(ORB.java:880)

To reproduce the latter:

% cd v2/appserv-tests/devtests/transaction/ee/dblogs/mdb % ant all-local

glassfishrobot commented 12 years ago

@glassfishrobot Commented ss141213 said: I doubt very much this is a class loader issue. Does this test pass in 3.1.1? Did this test pass in earlier builds of 3.1.2? Where does the class com.sun.corba.ee.impl.oa.rfm.ReferenceManagerConfigurator get loaded from in environment where it worked?

glassfishrobot commented 12 years ago

@glassfishrobot Commented mvatkina said: This is a new test, and in 3.1.1 MDBs didn't start ORB (they don't need ORB, but JMS does, but that's subject of another issue).

The class is in glassfish-corba-orb.jar

glassfishrobot commented 12 years ago

@glassfishrobot Commented ss141213 said: No, this is not a classloader issue. Asking a class loader to load an arbitrary class can result in ClassNotFoundException. Returning to ejb team to do further evaluation. Based on what I hear from Marina, the likely place where a fix is needed is corba, but I will let Marina take the call. An important question to answer is if this a regression? In other words, does the test pass in any previous build (or version)?

glassfishrobot commented 12 years ago

@glassfishrobot Commented mvatkina said: It's by no means an EJB issue (though is triggered by the test that executes an app with MDBs).

The ORB class is in a private package:

glassfish-corba-orb.jar META-INF/MANIFEST.MF: Private-Package: <...>, com.sun.corba.ee.impl.oa.rfm, <...> ORB-Class-Provider: com.sun.corba.ee.impl.oa.rfm.ReferenceManagerConfi gurator, <...>

So the question is why is it being loaded by the WebappCL?

The full stack trace: [#|2011-12-20T21:30:29.428-0800|WARNING|glassfish3.1.2|javax.enterprise.resource.corba.ORBUtil|_ThreadID=18;_ThreadName=Thread-2;|IOP01210054: Error while attempting to load class com.sun.corba.ee.impl.oa.rfm.ReferenceManagerConfigurator org.omg.CORBA.BAD_OPERATION: WARNING: IOP01210054: Error while attempting to load class com.sun.corba.ee.impl.oa.rfm.ReferenceManagerConfigurator vmcid: OMG minor code: 54 completed: No at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at com.sun.corba.ee.spi.orbutil.logex.corba.CorbaExtension.makeException(CorbaExtension.java:248) at com.sun.corba.ee.spi.orbutil.logex.corba.CorbaExtension.makeException(CorbaExtension.java:95) at com.sun.corba.ee.spi.orbutil.logex.WrapperGenerator.handleFullLogging(WrapperGenerator.java:387) at com.sun.corba.ee.spi.orbutil.logex.WrapperGenerator.access$400(WrapperGenerator.java:107) at com.sun.corba.ee.spi.orbutil.logex.WrapperGenerator$2.invoke(WrapperGenerator.java:511) at com.sun.corba.ee.spi.orbutil.proxy.CompositeInvocationHandlerImpl.invoke(CompositeInvocationHandlerImpl.java:99) at $Proxy158.classActionException(Unknown Source) at com.sun.corba.ee.spi.orb.OperationFactory$ClassAction.operate(OperationFactory.java:297) at com.sun.corba.ee.spi.orb.OperationFactory$ComposeAction.operate(OperationFactory.java:520) at com.sun.corba.ee.impl.orb.PrefixParserAction.apply(PrefixParserAction.java:96) at com.sun.corba.ee.spi.orb.PropertyParser.parse(PropertyParser.java:84) at com.sun.corba.ee.spi.orb.ParserImplBase.init(ParserImplBase.java:77) at com.sun.corba.ee.impl.orb.ORBConfiguratorImpl.runUserConfigurators(ORBConfiguratorImpl.java:179) at com.sun.corba.ee.impl.orb.ORBConfiguratorImpl.configure(ORBConfiguratorImpl.java:170) at com.sun.corba.ee.impl.orb.ORBImpl.postInit(ORBImpl.java:625) at com.sun.corba.ee.impl.orb.ORBImpl.set_parameters(ORBImpl.java:704) at com.sun.corba.ee.impl.orb.ORBImpl.setParameters(ORBImpl.java:691) at com.sun.corba.ee.spi.osgi.ORBFactory.initialize(ORBFactory.java:107) at org.glassfish.enterprise.iiop.impl.GlassFishORBManager.initORB(GlassFishORBManager.java:581) at org.glassfish.enterprise.iiop.impl.GlassFishORBManager.getORB(GlassFishORBManager.java:263) at org.glassfish.enterprise.iiop.impl.GlassFishORBFactoryImpl.createORB(GlassFishORBFactoryImpl.java:93) at org.glassfish.enterprise.iiop.api.GlassFishORBHelper.getORB(GlassFishORBHelper.java:152) at org.glassfish.enterprise.iiop.api.GlassFishORBHelper.getProtocolManager(GlassFishORBHelper.java:219) at com.sun.ejb.containers.BaseContainer.initializeProtocolManager(BaseContainer.java:824) at com.sun.ejb.containers.BaseContainer.(BaseContainer.java:508) at com.sun.ejb.containers.MessageBeanContainer.(MessageBeanContainer.java:142)

P.S. I do see a message similar to what Sherry obseved earlier than CNF error on the same instance startup: [#|2011-12-20T21:29:46.664-0800|WARNING|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.resource.recovery|_ThreadID=17;_ThreadName=Thread-2;|RAR8037: exception while getting transaction-support for RAR [ jmsra ] , com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: No Classloader found for RA [ jmsra ]|#]

glassfishrobot commented 12 years ago

@glassfishrobot Commented hvilekar said: There is no change in packaging for glassfish-corba-orb.jar META-INF/MANIFEST.MF in the latest build matches with 3.1.1.

I ran the test twice, and application deploy worked fine - both before and after cluster restart.

[java] Deployed /home/hv51393/repo/v2-appserv-tests/build/module/archive/tx-ee-dblogs-mdb-web.war

This is not ORB issue.

glassfishrobot commented 12 years ago

@glassfishrobot Commented mvatkina said: There is no JTS in the picture, it's an ORB init error on MDB deployment.

Harshad, which condition (classActionException) gets triggered by this path:

at $Proxy158.classActionException(Unknown Source) at com.sun.corba.ee.spi.orb.OperationFactory$ClassAction.operate(OperationFactory.java:297) at com.sun.corba.ee.spi.orb.OperationFactory$ComposeAction.operate(OperationFactory.java:520) at com.sun.corba.ee.impl.orb.PrefixParserAction.apply(PrefixParserAction.java:96) at com.sun.corba.ee.spi.orb.PropertyParser.parse(PropertyParser.java:84) at com.sun.corba.ee.spi.orb.ParserImplBase.init(ParserImplBase.java:77) at com.sun.corba.ee.impl.orb.ORBConfiguratorImpl.runUserConfigurators(ORBConfiguratorImpl.java:179) at com.sun.corba.ee.impl.orb.ORBConfiguratorImpl.configure(ORBConfiguratorImpl.java:170) at com.sun.corba.ee.impl.orb.ORBImpl.postInit(ORBImpl.java:625) at com.sun.corba.ee.impl.orb.ORBImpl.set_parameters(ORBImpl.java:704) at com.sun.corba.ee.impl.orb.ORBImpl.setParameters(ORBImpl.java:691) at com.sun.corba.ee.spi.osgi.ORBFactory.initialize(ORBFactory.java:107)

glassfishrobot commented 12 years ago

@glassfishrobot Commented hvilekar said: I'm unable to duplicate the failure (Red Hat Enterprise Linux 6.1, JDK 1.6.0_30, GF 3.1.2 latest build). However, following the code path, the above exception seems to be coming from:

com.sun.corba.ee.impl.osgi.loader.OSGiListener.evaluate():196 return secureLoadClass( bundle, className ); << Throw exception here com.sun.corba.ee.impl.osgi.loader.OSGiListener.evaluate():187 Bundle bundle = getBundleForClass( className ) ; com.sun.corba.ee.spi.orb.OperationFactory.operate(): 294 Class<?> result = resolver.evaluate( className ) ;

% cd v2/appserv-tests/devtests/transaction/ee/dblogs/mdb % ant all-local

What configuration (OS/JDK/GlassFish Build) is the test failing consistently ?

glassfishrobot commented 12 years ago

@glassfishrobot Commented mvatkina said: Interesting. Hudson tests on x86 passed today. I run my tests on mac

glassfishrobot commented 12 years ago

@glassfishrobot Commented hvilekar said: Sherry, How is your test result looking with the latest 3.1.2 build ? Do you see following exception in the server log for any of the instances ?

org.omg.CORBA.BAD_OPERATION: WARNING: IOP01210054: Error while attempting to load class

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: Harshad, I did not notice this error in previous and recent 3.1.2 build.

org.omg.CORBA.BAD_OPERATION: WARNING: IOP01210054: Error while attempting to load class

glassfishrobot commented 12 years ago

@glassfishrobot Commented hvilekar said: Please reopen if the test fails again.

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: Reopen the bug since the intermittent failures are shown in the execution of 3.1.2 b17.

1) One of previously reported intermittent failure is shown on 3.1.2 build 17 tests, e.g. marina vatkina added a comment - 15/Dec/11 07:07 AM The test failed because instance2 was down on deploy The related test details are provided as below.

2) The test env can be view from hudson job. --One run with Derby with 1 intermittent failure http://sqe-hudson.us.oracle.com:8080/hudson/job/sherry-lc-core/

--Another run with Oracle DB with 3 intermittent failures. http://sqe-hudson.us.oracle.com:8080/hudson/view/Sherry_core/job/sherry-lc-oracle/

3) tx test report shows what types of tests have intermittent failure http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build17/core/lc_oracle4_rhl/html/test_results_txn.html jms/jdbc app with txlog in database has intermittent failure. jdbc only app with txlog in database, all passed. jms/jdbc app or jdbc only app with txlog in file system, all passed.

4) client test output shows when failure occurred. http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build17/core/lc_oracle4_rhl/output/txn_recovery_dblog.output The tests are execution in the following order tx-cliapp3-t1mql-dblog-killASInstanceAutomaticDelegatedRecoveryID pass tx-cliapp2-t2mql-dblog-killASInstanceAdmincliDelegatedRecoveryID pass tx-cliapp2-t3mql-dblog-killASInstanceAutomaticSelfRecoveryID pass tx-cliapp2-t4mql-dblog-killMQBrokerAutomaticSelfRecovery1ID fail tx-cliapp2-t4mql-dblog-killMQBrokerAutomaticSelfRecovery2ID fail tx-cliapp2-t5mql-dblog-killDBServerAdmincliSelfRecoveryID fail tx-cliweb4-t1-killASInstanceAdmincliDelegatedRecoveryID pass

From the first failure at, tx-cliapp2-t4mql-dblog-killMQBrokerAutomaticSelfRecovery1ID, instance2 was down on deploy.

deploy-common-ee: [echo] So you are using Enterprise Version eh ? [exec] asadmin --host localhost --port 4848 --user admin --passwordfile /export/hudson/workspace/sherry-lc-oracle/appserver-sqe/build-config/adminpassword.txt --interactive=false --echo=true --terse=false deploy --force=false --precompilejsp=false --verify=false --retrieve /export/hudson/workspace/sherry-lc-oracle/appserver-sqe/build/ee/i386_apg-sqe2_Linux/cliapp2/archive --generatermistubs=false --availabilityenabled=false --asyncreplication=true --target sqe-cluster --keepreposdir=false --keepfailedstubs=false --isredeploy=false --logreportederrors=true /export/hudson/workspace/sherry-lc-oracle/appserver-sqe/build/ee/i386_apg-sqe2_Linux/cliapp2/archive/cliapp2App.ear [exec] Application deployed with name cliapp2App. [exec] WARNING: Command _deploy did not complete successfully on server instance clustered_instance_2: remote failure: Failed to load the application on instance clustered_instance_2. The application will not run properly. Please fix your application and redeploy. [exec] Exception while loading the app : EJB Container initialization error. Please see server.log for more details. [exec] Command deploy completed with warnings.

[echo] Deployment on target server sqe-cluster successful

5) Related server.log Some server.log files are saved at http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build17/core/lc_oracle4_rhl/log/

http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build17/core/lc_oracle4_rhl/log/server_in2.log_2012-01-11T11-31-37_dbg --deployed normally for cliapp2 t3

[#|2012-01-11T09:50:34.619-0800|INFO|glassfish3.1.2|javax.enterprise.system.tools.admin.org.glassfish.deployment.admin|_ThreadID=21;_ThreadName=Thread-2;|cliapp2App was successfully deployed in 6,704 milliseconds.|#]

--Failure appeared, after that deploy has failure on cliapp2 t4.

[#|2012-01-11T09:56:14.803-0800|INFO|glassfish3.1.2|javax.enterprise.system.container.ejb.com.sun.ejb.containers| _ThreadID=21;_ThreadName=Thread-2;|EJB5181:Portable JNDI names for EJB SessionBean: [java:global/cliapp2App/cliapp2-ejb/SessionBean, java:global/cliapp2App/cliapp2-ejb/SessionBean!tx.recovery.cliapp2.SessionLocal]|#]

[#|2012-01-11T09:56:17.421-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.corba.ORBUtil| _ThreadID=21;_ThreadName=Thread-2;| IOP00410016: Unable to create IIOP listener on the specified host all interfaces and port 43,700 org.omg.CORBA.COMM_FAILURE: SEVERE: IOP00410016: Unable to create IIOP listener on the specified host all interfaces and port 43,700 vmcid: OMG minor code: 16 completed: No at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39) at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27) at java.lang.reflect.Constructor.newInstance(Constructor.java:513) at com.sun.corba.ee.spi.orbutil.logex.corba.CorbaExtension.makeException(CorbaExtension.java:248) at com.sun.corba.ee.spi.orbutil.logex.corba.CorbaExtension.makeException(CorbaExtension.java:95) at com.sun.corba.ee.spi.orbutil.logex.WrapperGenerator.handleFullLogging(WrapperGenerator.java:387) at com.sun.corba.ee.spi.orbutil.logex.WrapperGenerator.access$400(WrapperGenerator.java:107) at com.sun.corba.ee.spi.orbutil.logex.WrapperGenerator$2.invoke(WrapperGenerator.java:511) at com.sun.corba.ee.spi.orbutil.proxy.CompositeInvocationHandlerImpl.invoke(CompositeInvocationHandlerImpl.java:99) at $Proxy160.createListenerFailed(Unknown Source) at com.sun.corba.ee.impl.transport.SocketOrChannelAcceptorImpl.initialize(SocketOrChannelAcceptorImpl.java:98) at com.sun.corba.ee.impl.transport.CorbaTransportManagerImpl.getAcceptors(CorbaTransportManagerImpl.java:248) at com.sun.corba.ee.impl.transport.CorbaTransportManagerImpl.addToIORTemplate(CorbaTransportManagerImpl.java:267) at com.sun.corba.ee.spi.oa.ObjectAdapterBase.initializeTemplate(ObjectAdapterBase.java:103) at com.sun.corba.ee.impl.oa.poa.POAImpl.initialize(POAImpl.java:553) at com.sun.corba.ee.impl.oa.poa.POAImpl.makeRootPOA(POAImpl.java:352) at com.sun.corba.ee.impl.oa.poa.POAFactory$1.evaluate(POAFactory.java:286) at com.sun.corba.ee.impl.orbutil.closure.Future.evaluate(Future.java:61) at com.sun.corba.ee.impl.resolver.LocalResolverImpl.resolve(LocalResolverImpl.java:55) at com.sun.corba.ee.impl.resolver.CompositeResolverImpl.resolve(CompositeResolverImpl.java:59) at com.sun.corba.ee.impl.orb.ORBImpl.resolve_initial_references(ORBImpl.java:1266) at com.sun.corba.ee.impl.naming.cosnaming.TransientNameService.initialize(TransientNameService.java:124) at com.sun.corba.ee.impl.naming.cosnaming.TransientNameService.(TransientNameService.java:93) at org.glassfish.enterprise.iiop.impl.PEORBConfigurator.configure(PEORBConfigurator.java:161) at com.sun.corba.ee.impl.orb.ORBConfiguratorImpl.runUserConfigurators(ORBConfiguratorImpl.java:185) at com.sun.corba.ee.impl.orb.ORBConfiguratorImpl.configure(ORBConfiguratorImpl.java:170) at com.sun.corba.ee.impl.orb.ORBImpl.postInit(ORBImpl.java:625) at com.sun.corba.ee.impl.orb.ORBImpl.set_parameters(ORBImpl.java:704) at com.sun.corba.ee.impl.orb.ORBImpl.setParameters(ORBImpl.java:691) at com.sun.corba.ee.spi.osgi.ORBFactory.initialize(ORBFactory.java:107) at org.glassfish.enterprise.iiop.impl.GlassFishORBManager.initORB(GlassFishORBManager.java:581) at org.glassfish.enterprise.iiop.impl.GlassFishORBManager.getORB(GlassFishORBManager.java:263) at org.glassfish.enterprise.iiop.impl.GlassFishORBFactoryImpl.createORB(GlassFishORBFactoryImpl.java:93) at org.glassfish.enterprise.iiop.api.GlassFishORBHelper.getORB(GlassFishORBHelper.java:152) at org.glassfish.enterprise.iiop.api.GlassFishORBHelper.getProtocolManager(GlassFishORBHelper.java:219) at com.sun.ejb.containers.BaseContainer.initializeProtocolManager(BaseContainer.java:824) at com.sun.ejb.containers.BaseContainer.(BaseContainer.java:508) at com.sun.ejb.containers.MessageBeanContainer.(MessageBeanContainer.java:142) at com.sun.ejb.containers.ContainerFactoryImpl.createContainer(ContainerFactoryImpl.java:121) at org.glassfish.ejb.startup.EjbApplication.loadContainers(EjbApplication.java:230) at org.glassfish.ejb.startup.EjbDeployer.load(EjbDeployer.java:299) at org.glassfish.ejb.startup.EjbDeployer.load(EjbDeployer.java:105) at org.glassfish.internal.data.ModuleInfo.load(ModuleInfo.java:186) at org.glassfish.internal.data.ApplicationInfo.load(ApplicationInfo.java:264) at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:460) at com.sun.enterprise.v3.server.ApplicationLifecycle.deploy(ApplicationLifecycle.java:240) at org.glassfish.deployment.admin.InstanceDeployCommand.execute(InstanceDeployCommand.java:187) at com.sun.enterprise.v3.admin.CommandRunnerImpl$1.execute(CommandRunnerImpl.java:348) at com.sun.enterprise.v3.admin.CommandRunnerImpl.doCommand(CommandRunnerImpl.java:363) at com.sun.enterprise.v3.admin.CommandRunnerImpl.doCommand(CommandRunnerImpl.java:1066) at com.sun.enterprise.v3.admin.CommandRunnerImpl.access$1200(CommandRunnerImpl.java:95) at com.sun.enterprise.v3.admin.CommandRunnerImpl$ExecutionContext.execute(CommandRunnerImpl.java:1291) at com.sun.enterprise.v3.admin.CommandRunnerImpl$ExecutionContext.execute(CommandRunnerImpl.java:1259) at com.sun.enterprise.v3.admin.AdminAdapter.doCommand(AdminAdapter.java:461) at com.sun.enterprise.v3.admin.AdminAdapter.service(AdminAdapter.java:212) at com.sun.grizzly.tcp.http11.GrizzlyAdapter.service(GrizzlyAdapter.java:179) at com.sun.enterprise.v3.server.HK2Dispatcher.dispath(HK2Dispatcher.java:117) at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:238) at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:849) at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:746) at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1045) at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:228) at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90) at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79) at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54) at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59) at com.sun.grizzly.ContextTask.run(ContextTask.java:71) at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532) at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513) at java.lang.Thread.run(Thread.java:662) Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52) at org.glassfish.enterprise.iiop.impl.IIOPSSLSocketFactory.createServerSocket(IIOPSSLSocketFactory.java:292) at com.sun.corba.ee.impl.transport.SocketOrChannelAcceptorImpl.initialize(SocketOrChannelAcceptorImpl.java:91) ... 61 more

| #] |

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: From P2 to P3.

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: See similar error in Solaris 11 and RequestAndApplicationScopeEJBMDB.zip http://java.net/jira/browse/GLASSFISH-18155 org.omg.CORBA.COMM_FAILURE: SEVERE: IOP00410016: Unable to create IIOP listener on the specified host asqe-sblade-8.us.oracle.com and port 0 vmcid: OMG minor code: 16 completed: No

glassfishrobot commented 12 years ago

@glassfishrobot Commented hvilekar said: instance2 log is showing Address already in use error - indicating the port is not available. Is that expected by the test ? If not, consider adding a small delay between instance restart and see if you get consistent results.


http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build17/core/lc_oracle4_rhl/log/server_in2.log_2012-01-11T11-31-37

Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52) at org.glassfish.enterprise.iiop.impl.IIOPSSLSocketFactory.createServerSocket(IIOPSSLSocketFactory.java:292) at com.sun.corba.ee.impl.transport.SocketOrChannelAcceptorImpl.initialize(SocketOrChannelAcceptorImpl.java:91) ... 61 more

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: I can't tell if IOP00410016 is root cause for tx intermittent failures.

I gave a pointer of the log so that other may find root cause around.

I do have time delay in tests for restart instance, e.g. around 15 seconds for most cases.

I checked server.log for this case, the IOP00410016 is shown at 55 seconds after last instance restart. http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build17/core/lc_oracle4_rhl/log/in2/server.log_2012-01-11T11-31-37 [#|2012-01-11T09:55:22.204-0800|INFO|glassfish3.1.2|ShoalLogger|_ThreadID=12;_ThreadName=Thread-2;|GMS1092: GMS View Change Received for group: sqe-cluster : Members in view for JOINED_AND_READY_EVENT(before change analysis) are : 1: MemberId: clustered_instance_1, MemberType: CORE, Address: 10.133.184.142:9134:228.9.200.101:10177:sqe-cluster:clustered_instance_1 2: MemberId: clustered_instance_2, MemberType: CORE, Address: 10.133.184.142:9161:228.9.200.101:10177:sqe-cluster:clustered_instance_2 3: MemberId: clustered_instance_3, MemberType: CORE, Address: 10.133.184.142:9178:228.9.200.101:10177:sqe-cluster:clustered_instance_3 4: MemberId: server, MemberType: SPECTATOR, Address: 10.133.184.142:9187:228.9.200.101:10177:sqe-cluster:server

| #] |

.....

[#|2012-01-11T09:56:17.421-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.corba.ORBUtil|_ThreadID=21;_ThreadName=Thread-2;|IOP00410016: Unable to create IIOP listener on the specified host all interfaces and port 43,700 org.omg.CORBA.COMM_FAILURE: SEVERE: IOP00410016: Unable to create IIOP listener on the specified host all interfaces and port 43,700 vmcid: OMG minor code: 16 completed: No at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method) ...... Caused by: java.net.BindException: Address already in use at sun.nio.ch.Net.bind(Native Method) at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:126) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:59) at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:52) at org.glassfish.enterprise.iiop.impl.IIOPSSLSocketFactory.createServerSocket(IIOPSSLSocketFactory.java:292) at com.sun.corba.ee.impl.transport.SocketOrChannelAcceptorImpl.initialize(SocketOrChannelAcceptorImpl.java:91) ... 61 more

| #] |

I compared 2 runs for IOP00410016 on in2. --3 intermittent tx failure with 16 times of IOP00410016 /net/asqe-logs/export1/3.1.2/Results/build17/core/lc_oracle4_rhl/log/in2 -bash-3.2$ grep IOP00410016 server.log* | wc -l 16 -bash-3.2$

--1 intermittent tx failure with 22 times of IOP00410016 -bash-3.2$ pwd /net/asqe-logs/export1/3.1.2/Results/build17/core/lc_derby_rhl/log/in2 -bash-3.2$ grep IOP00410016 server.lo* | wc -l 22 -bash-3.2$ The test results are saved at http://agni-1.us.oracle.com/JSPWiki/Wiki.jsp?page=V312CoreTest RunID 39 Oracle DB and RunID 40 Derby

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: As discussed with Harshad, I added a sleep time (30 sec) between restart-cluster to see if the address can be released, w.r.t. Caused by: java.net.BindException: Address already in use From Harshad, some platforms take longer time than others to release the address. I repeated the previous core-local-cluster executions on the same platform, RHL5, tx tests got 100% pass without IOP00410016 in server.log, e.g. http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build17/core/lc_derby_rhl2/html/summaryreport.html

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: Another run with 30 seconds between cluster restart is finished with 2 intermittent failures for tx tests and 22 IOP004 error in execution http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build17/core/lc_oracle4_rhl2/html/summaryreport.html

hudson@apg-sqe2% find . -name "." -exec grep -l -i "IOP004" {} \; ./in2/server.log_2012-01-13T11-21-16 hudson@apg-sqe2% pwd /net/asqe-logs/export1/3.1.2/Results/build17/core/lc_oracle4_rhl2/log hudson@apg-sqe2% grep IOP004 ./in2/server.log_2012-01-13T11-21-16 | wc -l 22 hudson@apg-sqe2%

The failure appeared in cliapp2 with tx log on file system at this time. deploy-common-ee: [echo] So you are using Enterprise Version eh ? [exec] asadmin --host localhost --port 4848 --user admin --passwordfile /export/hudson/workspace/sherry-lc-oracle/appserver-sqe/build-config/adminpassword.txt --interactive=false --echo=true --terse=false deploy --force=false --precompilejsp=false --verify=false --retrieve /export/hudson/workspace/sherry-lc-oracle/appserver-sqe/build/ee/i386_apg-sqe2_Linux/cliapp2/archive --generatermistubs=false --availabilityenabled=false --asyncreplication=true --target sqe-cluster --keepreposdir=false --keepfailedstubs=false --isredeploy=false --logreportederrors=true /export/hudson/workspace/sherry-lc-oracle/appserver-sqe/build/ee/i386_apg-sqe2_Linux/cliapp2/archive/cliapp2App.ear [exec] Application deployed with name cliapp2App. [exec] WARNING: Command _deploy did not complete successfully on server instance clustered_instance_2: remote failure: Failed to load the application on instance clustered_instance_2. The application will not run properly. Please fix your application and redeploy. [exec] Exception while loading the app : EJB Container initialization error. Please see server.log for more details. [exec] Command deploy completed with warnings. [echo] Deployment on target server sqe-cluster successful ..... [java] ----------------------------------------- [java] - tx-cliapp2-t4mql-killMQBrokerAutomaticSelfRecovery1: FAIL - [java] -----------------------------------------

glassfishrobot commented 12 years ago

@glassfishrobot Commented hvilekar said: The latest run is showing following error in the instance2 server log. Assigning to the jms team to review.

[#|2012-01-13T09:50:06.885-0800|WARNING|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.jtsxa|_ThreadID=22;_ThreadName=Thread-2;|JTS5067: Unexpected error occurred in commit javax.transaction.xa.XAException at com.sun.messaging.jmq.jmsclient.XAResourceForMC.commit(XAResourceForMC.java:248) at com.sun.jts.jtsxa.OTSResourceImpl.commit(OTSResourceImpl.java:114) at com.sun.jts.CosTransactions.RegisteredResources.distributeCommit(RegisteredResources.java:763) at com.sun.jts.CosTransactions.TopCoordinator.commit(TopCoordinator.java:2099) at com.sun.jts.CosTransactions.CoordinatorTerm.commit(CoordinatorTerm.java:412) at com.sun.jts.CosTransactions.TerminatorImpl.commit(TerminatorImpl.java:250) at com.sun.jts.CosTransactions.CurrentImpl.commit(CurrentImpl.java:633) at com.sun.jts.jta.TransactionManagerImpl.commit(TransactionManagerImpl.java:332) at com.sun.enterprise.transaction.jts.JavaEETransactionManagerJTSDelegate.commitDistributedTransaction(JavaEETransactionManagerJTSDelegate.java:185) at com.sun.enterprise.transaction.JavaEETransactionManagerSimplified.commit(JavaEETransactionManagerSimplified.java:861) at com.sun.ejb.containers.BaseContainer.completeNewTx(BaseContainer.java:5136) at com.sun.ejb.containers.BaseContainer.postInvokeTx(BaseContainer.java:4901) at com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2045) at com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:1994) at com.sun.ejb.containers.EJBLocalObjectInvocationHandler.invoke(EJBLocalObjectInvocationHandler.java:222) at com.sun.ejb.containers.EJBLocalObjectInvocationHandlerDelegate.invoke(EJBLocalObjectInvocationHandlerDelegate.java:88) at $Proxy211.testxa(Unknown Source) at tx.recovery.cliapp2.TestServlet.doGet(TestServlet.java:50) at javax.servlet.http.HttpServlet.service(HttpServlet.java:668) at javax.servlet.http.HttpServlet.service(HttpServlet.java:770) at org.apache.catalina.core.StandardWrapper.service(StandardWrapper.java:1542) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:281) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:655) at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:595) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:161) at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:331) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:231) at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:174) at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:849) at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:746) at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1045) at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:228) at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90) at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79) at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54) at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59) at com.sun.grizzly.ContextTask.run(ContextTask.java:71) at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532) at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513) at java.lang.Thread.run(Thread.java:662) Caused by: com.sun.messaging.jms.JMSException: [COMMIT_TRANSACTION_REPLY(47)] [C4036]: A broker error occurred. :[404] Uknown XID 6170672D737165322C636C757374657265645F696E7374616E63655F322C5034333730302C01020000009F012FD86170672D737165322C636C757374657265645F696E7374616E63655F322C503433373030 user=guest, broker=localhost:58686(45712) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.throwServerErrorException(ProtocolHandler.java:4103) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.commit(ProtocolHandler.java:3600) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.commit(ProtocolHandler.java:3517) at com.sun.messaging.jmq.jmsclient.XAResourceForMC.commit(XAResourceForMC.java:226) ... 42 more Caused by: com.sun.messaging.jms.JMSException: [COMMIT_TRANSACTION_REPLY(47)] [C4036]: A broker error occurred. :[404] Uknown XID 6170672D737165322C636C757374657265645F696E7374616E63655F322C5034333730302C01020000009F012FD86170672D737165322C636C757374657265645F696E7374616E63655F322C503433373030 user=guest, broker=localhost:58686(45712) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.throwServerErrorException(ProtocolHandler.java:4088) ... 45 more

glassfishrobot commented 12 years ago

@glassfishrobot Commented @amykang2020 said: To recovery a prepared transaction in MQ, it needs to connect to the broker that owns the transaction (see also RFE 6944363).

However further looking at where the "Uknown XID" JMSException (referenced in Marina's below email) occurred and the full stacktrace (see the end of this comment section) in the server log, the "Uknown XID" JMSException did not occur at transaction recovery time, it occurred in the servlet doGet() at container commit which is after the test injected failure between PREPARE and COMMIT to the connected broker while TM injected a sleep between PREPARE and COMMIT. Hence after the sleep, at COMMIT time, the MQ client runtime auto-reconnect to another broker (note: the test uses conventional cluster) which doesn't know about the PREPARED transaction because it owned by the killed broker, hence the "Unknown XID" exception. This is expected in conventional cluster. Afterward, the test will do transaction recovery to recover the PREPARED transaction.


[#|2012-01-13T09:48:36.826-0800|WARNING|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks|_ThreadID=22;_ThreadName=Thread-2;|JTS5057: FailPoint : [null]|#]

[#|2012-01-13T09:48:36.826-0800|WARNING|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks|_ThreadID=22;_ThreadName=Thread-2;|JTS5057: FailPoint : [2]|#]

==> the above line means tx sleeps between prepare and commit

(2) right after above:

[#|2012-01-13T09:48:51.194-0800|WARNING|glassfish3.1.2|javax.jms|_ThreadID=23;_ThreadName=Thread-2;|[I500]: Caught JVM Exception: java.io.EOFException: Trying to read 72 bytes. Already read 0 bytes.|#]

[#|2012-01-13T09:48:51.194-0800|WARNING|glassfish3.1.2|javax.jms|_ThreadID=24;_ThreadName=Thread-2;|[I500]: Caught JVM Exception: java.io.EOFException: Trying to read 72 bytes. Already read 0 bytes.|#]

[#|2012-01-13T09:48:51.196-0800|INFO|glassfish3.1.2|javax.resourceadapter.mqjmsra.inbound.message|_ThreadID=26;_ThreadName=Thread-2;|MQJMSRA_EC1101: onEvent:Connection Event:E206:[E206]: Connection closed. The connection is closed due to a network problem, broker crashed, or internal error: localhost:48686(60082), com.sun.messaging.jms.notification.ConnectionClosedEvent[source=BrokerAddress=localhost:48686(60082), ConnectionID=5909906714122617856, ReconnectEnabled: false, IsConnectedToHABroker: false]|#]

<...> (3) After mostly repeating the above logging entries, there is this recovery happening somehow:

[#|2012-01-13T09:49:24.221-0800|INFO|glassfish3.1.2|javax.jms.connection|_ThreadID=24;_ThreadName=Thread-2;|[I107]: Connection recover state: RECOVER_TRANSPORT_CONNECTED, broker: localhost:58686(45712)|#]

[#|2012-01-13T09:49:24.221-0800|INFO|glassfish3.1.2|javax.jms.connection|_ThreadID=24;_ThreadName=Thread-2;|[I107]: Connection recover state: RECOVER_STARTED, broker: localhost:58686(45712)|#]

[#|2012-01-13T09:49:24.221-0800|INFO|glassfish3.1.2|javax.jms.connection|_ThreadID=29;_ThreadName=Thread-2;|[I107]: Connection recover state: RECOVER_IN_PROCESS, broker: localhost:58686(45712)|#]

[#|2012-01-13T09:49:24.227-0800|INFO|glassfish3.1.2|javax.jms.connection|_ThreadID=29;_ThreadName=Thread-2;|[I107]: Connection recover state: RECOVER_SUCCEEDED, broker: localhost:58686(45712)|#]

[#|2012-01-13T09:49:24.228-0800|INFO|glassfish3.1.2|javax.resourceadapter.mqjmsra.outbound.connection|_ThreadID=28;_ThreadName=Thread-2;|MQJMSRA_CL1101: onEvent:Connection Event for mc=1 :xacId=:event:E301:[E301]: Connection reconnected to the broker: localhost:58686(45712), com.sun.messaging.jms.notification.ConnectionReconnectedEvent[source=BrokerAddress=localhost:58686(45712), ConnectionID=8108507757229997056, ReconnectEnabled: true, IsConnectedToHABroker: false]|#]

[#|2012-01-13T09:49:24.228-0800|INFO|glassfish3.1.2|javax.jms.connection|_ThreadID=29;_ThreadName=Thread-2;|[I107]: Connection recover state: RECOVER_INACTIVE, broker: localhost:58686(45712)|#]

[#|2012-01-13T09:49:24.234-0800|WARNING|glassfish3.1.2|javax.jms|_ThreadID=26;_ThreadName=Thread-2;|[C4003]: Error occurred on connection creation [localhost:48686]. - cause: java.net.ConnectException: Connection refused|#]

[#|2012-01-13T09:49:29.257-0800|SEVERE|glassfish3.1.2|javax.resourceadapter.mqjmsra.inbound.message|_ThreadID=26;_ThreadName=Thread-2;|MQJMSRA_EC4001: onException:reconnect success

<...> (4) And then what seems like transaction being committed:

[#|2012-01-13T09:50:06.840-0800|WARNING|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks|_ThreadID=22;_ThreadName=Thread-2;|JTS5057: FailPoint : [null]|#]

[#|2012-01-13T09:50:06.882-0800|WARNING|glassfish3.1.2|javax.jms|_ThreadID=22;_ThreadName=Thread-2;|[I500]: Caught JVM Exception: com.sun.messaging.jms.JMSException: [COMMIT_TRANSACTION_REPLY(47)] [C4036]: A broker error occurred. :[404] Uknown XID 6170672D737165322C636C757374657265645F696E7374616E63655F322C5034333730302C01020000009F012FD86170672D737165322C636C757374657265645F696E7374616E63655F322C503433373030 user=guest, broker=localhost:58686(45712)|#]

[#|2012-01-13T09:50:06.882-0800|SEVERE|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=22;_ThreadName=Thread-2;|MQRA:XARFMC:commit:XAException-Exception=[COMMIT_TRANSACTION_REPLY(47)] [C4036]: A broker error occurred. :[404] Uknown XID 6170672D737165322C636C757374657265645F696E7374616E63655F322C5034333730302C01020000009F012FD86170672D737165322C636C757374657265645F696E7374616E63655F322C503433373030 user=guest, broker=localhost:58686(45712)|#]

[#|2012-01-13T09:50:06.884-0800|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=22;_ThreadName=Thread-2;|com.sun.messaging.jms.JMSException: [COMMIT_TRANSACTION_REPLY(47)] [C4036]: A broker error occurred. :[404] Uknown XID 6170672D737165322C636C757374657265645F696E7374616E63655F322C5034333730302C01020000009F012FD86170672D737165322C636C757374657265645F696E7374616E63655F322C503433373030 user=guest, broker=localhost:58686(45712) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.throwServerErrorException(ProtocolHandler.java:4103) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.commit(ProtocolHandler.java:3600) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.commit(ProtocolHandler.java:3517) at com.sun.messaging.jmq.jmsclient.XAResourceForMC.commit(XAResourceForMC.java:226) at com.sun.jts.jtsxa.OTSResourceImpl.commit(OTSResourceImpl.java:114)

The full stacktrace of the "Uknown XID" JMSException [#|2012-01-13T10:49:15.531-0800|WARNING|glassfish3.1.2|javax.jms|_ThreadID=21;_ThreadName=Thread-2;|[I500]: Caught JVM Exception: com.sun.messaging.jms.JMSException: [COMMIT_TRANSACTION_REPLY(47)] [C4036]: A broker error occurred. :[404] Uknown XID 6170672D737165322C636C757374657265645F696E7374616E63655F322C5034333730302C0102000000AE6265D86170672D737165322C636C757374657265645F696E7374616E63655F322C503433373030 user=guest, broker=localhost:58686(42645)|#]

[#|2012-01-13T10:49:15.531-0800|SEVERE|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=21;_ThreadName=Thread-2;|MQRA:XARFMC:commit:XAException-Exception=[COMMIT_TRANSACTION_REPLY(47)] [C4036]: A broker error occurred. :[404] Uknown XID 6170672D737165322C636C757374657265645F696E7374616E63655F322C5034333730302C0102000000AE6265D86170672D737165322C636C757374657265645F696E7374616E63655F322C503433373030 user=guest, broker=localhost:58686(42645)|#]

[#|2012-01-13T10:49:15.533-0800|INFO|glassfish3.1.2|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=21;_ThreadName=Thread-2;|com.sun.messaging.jms.JMSException: [COMMIT_TRANSACTION_REPLY(47)] [C4036]: A broker error occurred. :[404] Uknown XID 6170672D737165322C636C757374657265645F696E7374616E63655F322C5034333730302C0102000000AE6265D86170672D737165322C636C757374657265645F696E7374616E63655F322C503433373030 user=guest, broker=localhost:58686(42645) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.throwServerErrorException(ProtocolHandler.java:4103) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.commit(ProtocolHandler.java:3600) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.commit(ProtocolHandler.java:3517) at com.sun.messaging.jmq.jmsclient.XAResourceForMC.commit(XAResourceForMC.java:226) at com.sun.jts.jtsxa.OTSResourceImpl.commit(OTSResourceImpl.java:114) at com.sun.jts.CosTransactions.RegisteredResources.distributeCommit(RegisteredResources.java:763) at com.sun.jts.CosTransactions.TopCoordinator.commit(TopCoordinator.java:2099) at com.sun.jts.CosTransactions.CoordinatorTerm.commit(CoordinatorTerm.java:412) at com.sun.jts.CosTransactions.TerminatorImpl.commit(TerminatorImpl.java:250) at com.sun.jts.CosTransactions.CurrentImpl.commit(CurrentImpl.java:633) at com.sun.jts.jta.TransactionManagerImpl.commit(TransactionManagerImpl.java:332) at com.sun.enterprise.transaction.jts.JavaEETransactionManagerJTSDelegate.commitDistributedTransaction(JavaEETransactionManagerJTSDelegate.java:185) at com.sun.enterprise.transaction.JavaEETransactionManagerSimplified.commit(JavaEETransactionManagerSimplified.java:861) at com.sun.ejb.containers.BaseContainer.completeNewTx(BaseContainer.java:5136) at com.sun.ejb.containers.BaseContainer.postInvokeTx(BaseContainer.java:4901) at com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:2045) at com.sun.ejb.containers.BaseContainer.postInvoke(BaseContainer.java:1994) at com.sun.ejb.containers.EJBLocalObjectInvocationHandler.invoke(EJBLocalObjectInvocationHandler.java:222) at com.sun.ejb.containers.EJBLocalObjectInvocationHandlerDelegate.invoke(EJBLocalObjectInvocationHandlerDelegate.java:88) at $Proxy126.testxa(Unknown Source) at tx.recovery.cliapp3.TestServlet.doGet(TestServlet.java:50) at javax.servlet.http.HttpServlet.service(HttpServlet.java:668) at javax.servlet.http.HttpServlet.service(HttpServlet.java:770) at org.apache.catalina.core.StandardWrapper.service(StandardWrapper.java:1542) at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:281) at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:175) at org.apache.catalina.core.StandardPipeline.doInvoke(StandardPipeline.java:655) at org.apache.catalina.core.StandardPipeline.invoke(StandardPipeline.java:595) at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:161) at org.apache.catalina.connector.CoyoteAdapter.doService(CoyoteAdapter.java:331) at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:231) at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:174) at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:849) at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:746) at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1045) at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:228) at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90) at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79) at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54) at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59) at com.sun.grizzly.ContextTask.run(ContextTask.java:71) at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532) at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513) at java.lang.Thread.run(Thread.java:662) Caused by: com.sun.messaging.jms.JMSException: [COMMIT_TRANSACTION_REPLY(47)] [C4036]: A broker error occurred. :[404] Uknown XID 6170672D737165322C636C757374657265645F696E7374616E63655F322C5034333730302C0102000000AE6265D86170672D737165322C636C757374657265645F696E7374616E63655F322C503433373030 user=guest, broker=localhost:58686(42645) at com.sun.messaging.jmq.jmsclient.ProtocolHandler.throwServerErrorException(ProtocolHandler.java:4088) ... 45 more

| #] |

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: 2 executions of core cluster tests are completed on 3.1.2 b18. http://agni-1.us.oracle.com/JSPWiki/Wiki.jsp?page=V312CoreTest

One run has 100% pass on tx, see RunID45, http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build18/core/lc_sm_rhl/html/summaryreport.html

Another run has 2 failures on tx recovery, RunID 46. http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build18/core/lc_sm_oel/html/summaryreport.html http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build18/core/lc_sm_oel/html/test_results_txn.html tx-cliapp3-t1mql-dblog-killASInstanceAutomaticDelegatedRecoveryID tx-cliapp2-t2mql-dblog-killASInstanceAdmincliDelegatedRecoveryID The test scenarios for the 2 failures are TX_01L and TX_02L in http://agni-1.us.oracle.com/asqe-logs/export1/3.1/docs/sqe/txn/GF31TXRecoveryTestSpec.html

http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build18/core/lc_sm_oel/output/txn_recovery_dblog.output The client output for the first failure show the following:

1) deploy is OK on 3 instances

deploy-common-ee: [echo] So you are using Enterprise Version eh ? [exec] asadmin --host localhost --port 4848 --user admin --passwordfile /export/hudson/workspace/sonia-oel6-v312-clu-smon/appserver-sqe/build-config/adminpassword.txt --interactive=false --echo=true --terse=false deploy --force=false --precompilejsp=false --verify=false --retrieve /export/hudson/workspace/sonia-oel6-v312-clu-smon/appserver-sqe/build/ee/amd64_asqeopteron1_Linux/cliapp3/archive --generatermistubs=false --availabilityenabled=false --asyncreplication=true --target sqe-cluster --keepreposdir=false --keepfailedstubs=false --isredeploy=false --logreportederrors=true /export/hudson/workspace/sonia-oel6-v312-clu-smon/appserver-sqe/build/ee/amd64_asqeopteron1_Linux/cliapp3/archive/cliapp3App.ear [exec] Application deployed with name cliapp3App. [exec] Command deploy executed successfully. [echo] Deployment on target server sqe-cluster successful

2) insert on in2 operate-data: [echo] Executing test on port 48080 [echo] build.classes.dir /export/hudson/workspace/sonia-oel6-v312-clu-smon/appserver-sqe/build/ee/amd64_asqeopteron1_Linux/cliapp3/classes [java] WS HOME appserver-sqe [java] main: testSuiteID=tx-cliapp3- [java] main: testCase=operate_xa [java] invoking webclient at http://localhost:48080/cliapp3/test?xa

3) verify on on1 verify-data: [echo] Executing test on port 38080 [java] WS HOME appserver-sqe [java] main: testSuiteID=tx-cliapp3- [java] main: testCase=verify_xa [java] invoking webclient at http://localhost:38080/cliapp3/verify?xa [java] Processing line: got exception [java] Processing line: javax.ejb.EJBException: Transaction aborted

[java] - tx-cliapp3-t1mql-dblog-killASInstanceAutomaticDelegatedRecovery: FAIL -

The server.log on in2 show that No Classloader found for RA [ __ds_jdbc_ra ], which one of the error that I reported initially on build 14.

http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build18/core/lc_sm_oel/log/in2/server.log_2012-01-19T17-00-26

[#|2012-01-19T13:51:36.731-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service| _ThreadID=17;_ThreadName=Thread-2;|RAR8061: failed to load resource-adapter-config or RA [ ds_jdbc_ra ], com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: No Classloader found for RA [ ds_jdbc_ra ]|#]

[#|2012-01-19T13:51:36.738-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service|_ThreadID=17;_ThreadName=Thread-2;|RAR8060: Unable to lookup pool [ poolNontx ], javax.naming.NamingException: Lookup failed for '__SYSTEM/pools/poolNontx' in SerialContext[myEnv=

{java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl, java.naming.factory.url.pkgs=com.sun.enterprise.naming} [Root exception is javax.naming.NameNotFoundException: pools]|#]

[#|2012-01-19T13:51:36.741-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service|_ThreadID=17;_ThreadName=Thread-2;|RAR6017 : Failed to get connection pool object jdbc/nontx via JNDI lookup : com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: Lookup failed for '__SYSTEM/pools/poolNontx' in SerialContext[myEnv={java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl, java.naming.factory.url.pkgs=com.sun.enterprise.naming}

|#]

[#|2012-01-19T13:51:36.917-0800|SEVERE|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.resource.recovery|_ThreadID=17;_ThreadName=Thread-2;|RAR7109: Error while loading jdbc resources during recovery : jdbc/nontx|#]

[#|2012-01-19T13:51:37.071-0800|INFO|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.CosTransactions|_ThreadID=17;_ThreadName=Thread-2;|JTS5014: Recoverable JTS instance, serverId = [100]|#]

.....

[#|2012-01-19T13:51:37.174-0800|SEVERE|glassfish3.1.2| javax.enterprise.system.core.transaction.com.sun.jts.CosTransactions|_ThreadID=18;_ThreadName=Thread-2;| javax.naming.NamingException: Lookup failed for 'jdbc/nontx' in SerialContext[myEnv=

{java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl, java.naming.factory.url.pkgs=com.sun.enterprise.naming} [Root exception is javax.naming.NamingException: Unable to lookup resource : jdbc/nontx [Root exception is com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: Lookup failed for '__SYSTEM/pools/poolNontx' in SerialContext[myEnv={java.naming.factory.initial=com.sun.enterprise.naming.impl.SerialInitContextFactory, java.naming.factory.state=com.sun.corba.ee.impl.presentation.rmi.JNDIStateFactoryImpl, java.naming.factory.url.pkgs=com.sun.enterprise.naming}

]] at com.sun.enterprise.naming.impl.SerialContext.lookup(SerialContext.java:518) at com.sun.enterprise.naming.impl.SerialContext.lookup(SerialContext.java:455) at javax.naming.InitialContext.lookup(InitialContext.java:392)

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: The tests are executed in a cluster of 3 instances. The cliapp3 is the first test in the execution. Since multiple tests are in execution, I listed a few time stamps of in2 log for both fail and pass cases to help reading the log. We need to look for why "No Classloader found for RA" happened. I saw that "Recovery of Inbound Transactions started." appears early in failed case, but did not know if this is related to the failure.

http://sqe-hudson.us.oracle.com:8080/hudson/job/sonia-oel6-v312-clu-smon/

3 b18, the first test of execution, cliapp3, failed

http://sqe-hudson.us.oracle.com:8080/hudson/job/sherry-lc/

9, b18, the first test of execution, cliapp3, passed

1) cliapp3 failed in one run.

http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build18/core/lc_sm_oel/log/in2/server.log_2012-01-19T17-00-26

1.1) deployed cliapp3

[#|2012-01-19T13:47:54.206-0800|INFO|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.inbound| _ThreadID=18;_ThreadName=Thread-2;|Recovery of Inbound Transactions started.|#]

.......

[#|2012-01-19T13:51:35.424-0800|INFO|glassfish3.1.2|ShoalLogger|_ThreadID=12;_ThreadName=Thread-2;| GMS1025: Adding Joined And Ready member: clustered_instance_3 group: sqe-cluster StartupState: GROUP_STARTUP |#]

[#|2012-01-19T13:51:37.162-0800|SEVERE|glassfish3.1.2| javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.service| _ThreadID=18;_ThreadName=Thread-2;|RAR8061: failed to load resource-adapter-config or RA [ ds_jdbc_ra ], com.sun.appserv.connectors.internal.api.ConnectorRuntimeException: No Classloader found for RA [ ds_jdbc_ra ]|#]

........

[#|2012-01-19T13:51:42.090-0800|INFO|glassfish3.1.2|javax.enterprise.resource.jms.com.sun.enterprise.connectors.jms.system| _ThreadID=10;_ThreadName=Thread-2;|JMS010: ADDRESSLIST in setJmsServiceProvider: mq://localhost:48686/,mq://localhost:58686/,mq://localhost:38686/|#]

[#|2012-01-19T13:51:42.090-0800|INFO|glassfish3.1.2|javax.enterprise.resource.jms.com.sun.enterprise.connectors.jms.system| _ThreadID=10;_ThreadName=Thread-2;|JMS08: JMS Service Connection URL is : mq://localhost:48686/,mq://localhost:58686/,mq://localhost:38686/|#]

[#|2012-01-19T13:51:42.984-0800|INFO|glassfish3.1.2|org.hibernate.validator.util.Version|_ThreadID=10;_ThreadName=Thread-2;| Hibernate Validator 4.2.0.Final|#]

[#|2012-01-19T13:51:43.851-0800|INFO|glassfish3.1.2|javax.resourceadapter.mqjmsra.lifecycle|_ThreadID=10;_ThreadName=Thread-2;| MQJMSRA_RA1101: GlassFish MQ JMS Resource Adapter: Version: 4.5.2 (Build 2-d) Compile: Thu Dec 8 17:30:48 PST 2011|#]

[#|2012-01-19T13:51:43.856-0800|INFO|glassfish3.1.2|javax.resourceadapter.mqjmsra.lifecycle|_ThreadID=10;_ThreadName=Thread-2;| MQJMSRA_RA1101: GlassFish MQ JMS Resource Adapter starting: broker is LOCAL, connection mode is TCP|#]

[#|2012-01-19T13:51:48.985-0800|WARNING|glassfish3.1.2|javax.jms|_ThreadID=10;_ThreadName=Thread-2;| [C4003]: Error occurred on connection creation [localhost:48686]. - cause: java.net.ConnectException: Connection refused|#]

........

[#|2012-01-19T13:53:20.885-0800|INFO|glassfish3.1.2| javax.enterprise.system.tools.admin.org.glassfish.deployment.admin| _ThreadID=21;_ThreadName=Thread-2;| cliapp3App was successfully deployed in 19,341 milliseconds.|#]

1.2) insert to database, and tx at prepared stage. [#|2012-01-19T13:53:44.590-0800|WARNING|glassfish3.1.2| javax.enterprise.system.core.transaction.com.sun.jts.utils.RecoveryHooks| _ThreadID=22; _ThreadName=Thread-2;|JTS5057: FailPoint : [2]|#]

1.3) deployed cliapp2, the next test, don't need to look further. [#|2012-01-19T14:07:40.600-0800|INFO|glassfish3.1.2| javax.enterprise.system.tools.admin.org.glassfish.deployment.admin| _ThreadID=21;_ThreadName=Thread-2;| cliapp2App was successfully deployed in 14,274 milliseconds.|#]

2) cliapp3 passed in another run.

http://agni-1.us.oracle.com/asqe-logs/export1/3.1.2/Results/build18/core/lc_sm_rhl/log/in2/server.log_2012-01-19T17-08-17 2.1) deployed cliapp3

[#|2012-01-19T15:04:59.796-0800|INFO|glassfish3.1.2|ShoalLogger|_ThreadID=12;_ThreadName=Thread-2;| GMS1025: Adding Joined And Ready member: clustered_instance_3 group: sqe-cluster StartupState: GROUP_STARTUP |#]

...... [#|2012-01-19T15:05:02.392-0800|INFO|glassfish3.1.2|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.inbound| _ThreadID=17;_ThreadName=Thread-2;|Recovery of Inbound Transactions started.|#]

[#|2012-01-19T15:05:03.052-0800|INFO|glassfish3.1.2|javax.enterprise.resource.jms.com.sun.enterprise.connectors.jms.system| _ThreadID=10;_ThreadName=Thread-2;|JMS010: ADDRESSLIST in setJmsServiceProvider: mq://localhost:48686/,mq://localhost:58686/,mq://localhost:38686/|#]

[#|2012-01-19T15:05:03.053-0800|INFO|glassfish3.1.2|javax.enterprise.resource.jms.com.sun.enterprise.connectors.jms.system| _ThreadID=10;_ThreadName=Thread-2;|JMS08: JMS Service Connection URL is : mq://localhost:48686/,mq://localhost:58686/,mq://localhost:38686/|#]

[#|2012-01-19T15:05:03.404-0800|INFO|glassfish3.1.2|org.hibernate.validator.util.Version|_ThreadID=10;_ThreadName=Thread-2;| Hibernate Validator 4.2.0.Final|#]

[#|2012-01-19T15:05:04.761-0800|INFO|glassfish3.1.2|javax.resourceadapter.mqjmsra.lifecycle|_ThreadID=10;_ThreadName=Thread-2;| MQJMSRA_RA1101: GlassFish MQ JMS Resource Adapter: Version: 4.5.2 (Build 2-d) Compile: Thu Dec 8 17:30:48 PST 2011|#]

[#|2012-01-19T15:05:04.761-0800|INFO|glassfish3.1.2|javax.resourceadapter.mqjmsra.lifecycle|_ThreadID=10;_ThreadName=Thread-2;| MQJMSRA_RA1101: GlassFish MQ JMS Resource Adapter starting: broker is LOCAL, connection mode is TCP|#]

[#|2012-01-19T15:05:05.499-0800|INFO|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.CosTransactions| _ThreadID=17;_ThreadName=Thread-2;|JTS5014: Recoverable JTS instance, serverId = [100]|#]

[#|2012-01-19T15:05:06.502-0800|INFO|glassfish3.1.2|javax.enterprise.system.core.transaction.com.sun.jts.CosTransactions| _ThreadID=18;_ThreadName=Thread-2;|JTS5078: Table for transaction logging was not added.|#] ........

[#|2012-01-19T15:05:07.697-0800|INFO|glassfish3.1.2|javax.resourceadapter.mqjmsra.lifecycle|_ThreadID=10;_ThreadName=Thread-2;| MQJMSRA_RA1101: GlassFish MQ JMS Resource Adapter Started:LOCAL|#]

.......

[#|2012-01-19T15:05:41.618-0800|INFO|glassfish3.1.2|javax.enterprise.system.tools.admin.org.glassfish.deployment.admin| _ThreadID=21;_ThreadName=Thread-2;|cliapp3App was successfully deployed in 7,444 milliseconds.|#]

glassfishrobot commented 12 years ago

@glassfishrobot Commented sherryshen said: Since #17971 is a following up of #17789. I summarized my observations from both bugs below: auto-delegated recovery with dblog in appserver-sqe/pe/transaction/recovery/cliapp3 --it always passed on b10, 10+ runs, the build had part of the support of tx db log. --it always failed on b11, 10+ runs, the build had more support of tx db log. --it had intermittent failure on b14, b18 with "No Classloader found for RA".

glassfishrobot commented 12 years ago

@glassfishrobot Commented @h2002044 said: I am able to reproduce the issue and have a fix. I tried the fix multiple times, locally and no reported exceptions were seen. Thanks to Sherry for trying this couple of times. I have also started a new run. http://sqe-hudson.us.oracle.com:8080/hudson/job/sherry-tx-lc-dbg/63/

Fix is to make sure that the global connector-class-loader object is initialized (assigned) only after the connector-class-loader chain (of all resource-adapters) is fully constructed.

glassfishrobot commented 12 years ago

@glassfishrobot Commented @h2002044 said: What is the impact on the customer of the bug?

How likely is it that a customer will see the bug and how serious is the bug?

Is it a regression? Does it meet other bug fix criteria (security, performance, etc.)?

What is the cost/risk of fixing the bug?

How risky is the fix? How much work is the fix? Is the fix complicated?

Is there an impact on documentation or message strings?

Which tests should QA (re)run to verify the fix did not destabilize GlassFish?

Which is the targeted build of 3.1.2 for this fix?