eclipse-ee4j / glassfish

Eclipse GlassFish
https://eclipse-ee4j.github.io/glassfish/
378 stars 143 forks source link

Regression. The deployment of generic-ra.rar failed with a timeout message. #15677

Closed glassfishrobot closed 13 years ago

glassfishrobot commented 13 years ago

Build 39. Linux, Windows machines.

Created a cluster with two instances on the same machine.

Tried to deploy to the cluster generic-ra.rar. After 10 minutes saw a timeot message.

Then tried to stop and start a cluster. The cluster did not start, but created a timeout message.

Please see the attached server.log of one instance. To reproduce the issue just execute this hudson job: http://sqe-hudson.us.oracle.com:8080/hudson/view/Elena_Jobs/job/deployment/

I saw this issue on Windows, Linux, Mac. This is a regression issue, it was not seen before.

glassfishrobot commented 6 years ago
glassfishrobot commented 13 years ago

@glassfishrobot Commented @honghzzhang said: elena: I was not able to reproduce this issue with the latest workspace. I created a cluster with two instances on my linux machine and I was able to deploy the rar without any problem. Can you attach the commands you used to create the cluster and deploy the rar? And is this problem only with rar and not with other type of applications?

glassfishrobot commented 13 years ago

@glassfishrobot Commented @honghzzhang said: Forgot to ask this in the last comment, were you able to reproduce the problem with just deploying this single application or it can be only reproduced with running the entire test suite?

I noticed the following stack trace in the attached server.log with the deployment time out. Not sure if it's something relate to the grizzly code, I will write to the relevant people to see if they are aware of anything.

[#|2011-01-24T19:45:36.412-0800|SEVERE|glassfish3.1|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=21;_ThreadName=Thread-1;|service exception java.lang.RuntimeException: ClientAbortException: java.nio.channels.ClosedChannelException at com.sun.enterprise.v3.admin.AdminAdapter.service(AdminAdapter.java:254) at com.sun.grizzly.tcp.http11.GrizzlyAdapter.service(GrizzlyAdapter.java:168) at com.sun.enterprise.v3.server.HK2Dispatcher.dispath(HK2Dispatcher.java:117) at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:234) at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:818) at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:719) at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1008) at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:225) at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90) at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79) at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54) at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59) at com.sun.grizzly.ContextTask.run(ContextTask.java:71) at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532) at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513) at java.lang.Thread.run(Thread.java:662) Caused by: ClientAbortException: java.nio.channels.ClosedChannelException at com.sun.grizzly.tcp.http11.GrizzlyOutputBuffer.doFlush(GrizzlyOutputBuffer.java:439) at com.sun.grizzly.tcp.http11.GrizzlyOutputBuffer.flush(GrizzlyOutputBuffer.java:405) at com.sun.grizzly.tcp.http11.GrizzlyOutputStream.flush(GrizzlyOutputStream.java:140) at com.sun.enterprise.v3.admin.AdminAdapter.service(AdminAdapter.java:251) ... 17 more Caused by: java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324) at com.sun.grizzly.util.OutputWriter.flushChannel(OutputWriter.java:108) at com.sun.grizzly.util.OutputWriter.flushChannel(OutputWriter.java:76) at com.sun.grizzly.http.SocketChannelOutputBuffer.flushChannel(SocketChannelOutputBuffer.java:326) at com.sun.grizzly.http.SocketChannelOutputBuffer.flushBuffer(SocketChannelOutputBuffer.java:398) at com.sun.grizzly.http.SocketChannelOutputBuffer.flush(SocketChannelOutputBuffer.java:376) at com.sun.grizzly.http.ProcessorTask.action(ProcessorTask.java:1236) at com.sun.grizzly.tcp.Response.action(Response.java:268) at com.sun.grizzly.tcp.http11.GrizzlyOutputBuffer.doFlush(GrizzlyOutputBuffer.java:434) ... 20 more

| #] |

glassfishrobot commented 13 years ago

@glassfishrobot Commented @honghzzhang said: elena: This was with 01/24 b39 nightly? When was the last time it still worked, that will help us narrow down the cause.

glassfishrobot commented 13 years ago

@glassfishrobot Commented easarina said: I believe I was able to narrow the issue. I've installed glassfish on one machine. created a cluster with two instances, then added a setting that I'm using to be able to deploy/execute timer apps. I've executed exactly such commands:

$S1AS_HOME/bin/asadmin stop-instance in1 $S1AS_HOME/bin/asadmin stop-instance in2 $S1AS_HOME/bin/asadmin stop-domain $S1AS_HOME/bin/asadmin start-domain $S1AS_HOME/bin/asadmin set configs.config.$CLUSTER-config.ejb-container.ejb-timer-service.tim er-datasource=jdbc/__default $S1AS_HOME/bin/asadmin create-resource-ref --target $CLUSTER jdbc/__default $S1AS_HOME/bin/asadmin start-local-instance --node localhost-domain1 in1 $S1AS_HOME/bin/asadmin start-local-instance --node localhost-domain1 in2

Then I've got a timeout during the deployment of generic-ra.rar

glassfishrobot commented 13 years ago

@glassfishrobot Commented easarina said: I was able to reproduce this issue on all machines Windows, Solaris, Mac, Linux

glassfishrobot commented 13 years ago

@glassfishrobot Commented @honghzzhang said: Thanks elena! Can you also attach the commands that you created the cluster so I have the exact same steps as you to reproduce the problem. Jagadish was also trying to reproduce the problem (with latest workspace + b39-24-Jan-2011 nightly) and could not reproduce either, so there must be some small differences in the commands which we used.

glassfishrobot commented 13 years ago

@glassfishrobot Commented easarina said: For the installation were used such simple steps:

! /bin/sh

. ./conf.txt $AS_HOME/bin/asadmin stop-instance in1 $AS_HOME/bin/asadmin stop-instance in2 $AS_HOME/bin/asadmin stop-domain

/bin/rm -rf $INST_DIR/glassfish3 /bin/rm -rf $INST_DIR/g.zip cd $INST_DIR wget -O g.zip http://download.java.net/glassfish/v3.1/nightly/latest-glassfish.zip unzip -q g.zip $AS_HOME/bin/asadmin start-domain $AS_HOME/bin/asadmin create-cluster c1 $AS_HOME/bin/asadmin create-local-instance --cluster c1 --systemproperties HTTP_LISTENER_PORT=18080:HTTP_SSL_LIST ENER_PORT=18181:IIOP_SSL_LISTENER_PORT=13800:IIOP_LISTENER_PORT=13700:JMX_SYSTEM_CONNECTOR_PORT=17676:IIOP_SSL_MU TUALAUTH_PORT=13801:JMS_PROVIDER_PORT=18686:ASADMIN_LISTENER_PORT=14848 in1

$AS_HOME/bin/asadmin create-local-instance --cluster c1 --systemproperties HTTP_LISTENER_PORT=28080:HTTP_SSL_LIS TENER_PORT=28181:IIOP_SSL_LISTENER_PORT=23800:IIOP_LISTENER_PORT=23700:JMX_SYSTEM_CONNECTOR_PORT=27676:IIOP_SSL_M UTUALAUTH_PORT=23801:JMS_PROVIDER_PORT=28686:ASADMIN_LISTENER_PORT=24848 in2 $AS_HOME/bin/asadmin start-local-instance in1 $AS_HOME/bin/asadmin start-local-instance in2

glassfishrobot commented 13 years ago

@glassfishrobot Commented easarina said: I've executed again these two simple scripts (i.e. installed the build, created a cluster with two instances, created that resource) and then just tried to deploy the app, got timeout and then timeouts when tried to stop and start a cluster.

glassfishrobot commented 13 years ago

@glassfishrobot Commented @honghzzhang said: Yes I can reproduce the problem with the steps you provided now. Thanks! I am trying to further narrow down the steps/problem now..

glassfishrobot commented 13 years ago

@glassfishrobot Commented easarina said: I've tried just wait for a long time and don't do anything after I saw: sadmin deploy --target c1 archives_nodb/generic-ra.rar No response from Domain Admin Server after 600 seconds. The command is either taking too long to complete or the server has failed. Please see the server log files for command status. Command deploy failed.

Then in the instance server.log I saw such messages:

[#|2011-01-25T11:22:13.695-0800|INFO|glassfish3.1|org.hibernate.validator.util.Version|_ThreadID=22;_ThreadName=Thread-1;|Hibernate Validator 4.1.0.Final|#]

[#|2011-01-25T11:22:13.715-0800|INFO|glassfish3.1|org.hibernate.validator.engine.resolver.DefaultTraversableResolver|_ThreadID=22;_ThreadName=Thread-1;|Instantiated an instance of org.hibernate.validator.engine.resolver.JPATraversableResolver.|#]

[#|2011-01-25T11:22:18.528-0800|INFO|glassfish3.1|org.hibernate.validator.engine.resolver.DefaultTraversableResolver|_ThreadID=22;_ThreadName=Thread-1;|Instantiated an instance of org.hibernate.validator.engine.resolver.JPATraversableResolver.|#]

[#|2011-01-25T11:22:18.560-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=122;_ThreadName=Thread-1;|[SimpleResourceAdapterImpl] ==> constructor...|#]

[#|2011-01-25T11:22:18.826-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=122;_ThreadName=Thread-1;|[SimpleResourceAdapterImpl] ==> setTestName called... name = ConfigPropertyForRA|#]

[#|2011-01-25T11:22:19.079-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=122;_ThreadName=Thread-1;|[SimpleResourceAdapterImpl] ==> 001. Simple RA start...|#]

[#|2011-01-25T11:22:19.080-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=122;_ThreadName=Thread-1;|[SimpleResourceAdapterImpl] ==> 002. Simple RA start...|#]

[#|2011-01-25T11:22:19.081-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=122;_ThreadName=Thread-1;|[SimpleResourceAdapterImpl] ==> 003. Simple RA start...|#]

[#|2011-01-25T11:22:19.082-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=122;_ThreadName=Thread-1;|[SimpleResourceAdapterImpl] ==> 004. Simple RA start...|#]

[#|2011-01-25T11:22:19.113-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=134;_ThreadName=Thread-1;|TestWMWork[1000].start running|#]

[#|2011-01-25T11:22:20.127-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=133;_ThreadName=Thread-1;|TestWMWork[8888].start running|#]

[#|2011-01-25T11:22:21.130-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=133;_ThreadName=Thread-1;|TestWMWork[8888].stop running|#]

[#|2011-01-25T11:22:21.201-0800|INFO|glassfish3.1|javax.enterprise.system.core.transaction.com.sun.jts.CosTransactions|_ThreadID=136;_ThreadName=Thread-1;|JTS5014: Recoverable JTS instance, serverId = [100]|#]

[#|2011-01-25T11:22:21.313-0800|INFO|glassfish3.1|javax.enterprise.resource.resourceadapter.com.sun.enterprise.connectors.inbound|_ThreadID=138;_ThreadName=Thread-1;|Recovery of Inbound Transactions started.|#]

[#|2011-01-25T11:37:12.440-0800|WARNING|glassfish3.1|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=12;_ThreadName=Thread-1;|GRIZZLY0023: Interrupting idle Thread: http-thread-pool-14848(4).|#]

[#|2011-01-25T11:37:12.453-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=122;_ThreadName=Thread-1;| ex = java.lang.InterruptedException, error code: 0|#]

[#|2011-01-25T11:37:12.454-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=122;_ThreadName=Thread-1;|[SimpleResourceAdapterImpl] ==> 005. Simple RA start...|#]

[#|2011-01-25T11:37:14.092-0800|INFO|glassfish3.1|javax.enterprise.system.tools.admin.org.glassfish.deployment.admin|_ThreadID=122;_ThreadName=Thread-1;|generic-ra was successfully deployed in 901,194 milliseconds.|#]

[#|2011-01-25T11:37:14.111-0800|SEVERE|glassfish3.1|com.sun.grizzly.config.GrizzlyServiceListener|_ThreadID=22;_ThreadName=Thread-1;|service exception java.lang.RuntimeException: ClientAbortException: java.nio.channels.ClosedChannelException at com.sun.enterprise.v3.admin.AdminAdapter.service(AdminAdapter.java:254) at com.sun.grizzly.tcp.http11.GrizzlyAdapter.service(GrizzlyAdapter.java:168) at com.sun.enterprise.v3.server.HK2Dispatcher.dispath(HK2Dispatcher.java:117) at com.sun.enterprise.v3.services.impl.ContainerMapper.service(ContainerMapper.java:234) at com.sun.grizzly.http.ProcessorTask.invokeAdapter(ProcessorTask.java:818) at com.sun.grizzly.http.ProcessorTask.doProcess(ProcessorTask.java:719) at com.sun.grizzly.http.ProcessorTask.process(ProcessorTask.java:1008) at com.sun.grizzly.http.DefaultProtocolFilter.execute(DefaultProtocolFilter.java:225) at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:137) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:104) at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:90) at com.sun.grizzly.http.HttpProtocolChain.execute(HttpProtocolChain.java:79) at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:54) at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:59) at com.sun.grizzly.ContextTask.run(ContextTask.java:71) at com.sun.grizzly.util.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:532) at com.sun.grizzly.util.AbstractThreadPool$Worker.run(AbstractThreadPool.java:513) at java.lang.Thread.run(Thread.java:662) Caused by: ClientAbortException: java.nio.channels.ClosedChannelException at com.sun.grizzly.tcp.http11.GrizzlyOutputBuffer.doFlush(GrizzlyOutputBuffer.java:439) at com.sun.grizzly.tcp.http11.GrizzlyOutputBuffer.flush(GrizzlyOutputBuffer.java:405) at com.sun.grizzly.tcp.http11.GrizzlyOutputStream.flush(GrizzlyOutputStream.java:140) at com.sun.enterprise.v3.admin.AdminAdapter.service(AdminAdapter.java:251) ... 17 more Caused by: java.nio.channels.ClosedChannelException at sun.nio.ch.SocketChannelImpl.ensureWriteOpen(SocketChannelImpl.java:126) at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:324) at com.sun.grizzly.util.OutputWriter.flushChannel(OutputWriter.java:108) at com.sun.grizzly.util.OutputWriter.flushChannel(OutputWriter.java:76) at com.sun.grizzly.http.SocketChannelOutputBuffer.flushChannel(SocketChannelOutputBuffer.java:326) at com.sun.grizzly.http.SocketChannelOutputBuffer.flushBuffer(SocketChannelOutputBuffer.java:398) at com.sun.grizzly.http.SocketChannelOutputBuffer.flush(SocketChannelOutputBuffer.java:376) at com.sun.grizzly.http.ProcessorTask.action(ProcessorTask.java:1236) at com.sun.grizzly.tcp.Response.action(Response.java:268) at com.sun.grizzly.tcp.http11.GrizzlyOutputBuffer.doFlush(GrizzlyOutputBuffer.java:434) ... 20 more

| #] |

[#|2011-01-25T11:37:14.118-0800|INFO|glassfish3.1|org.hibernate.validator.engine.resolver.DefaultTraversableResolver|_ThreadID=26;_ThreadName=Thread-1;|Instantiated an instance of org.hibernate.validator.engine.resolver.JPATraversableResolver.|#]

[#|2011-01-25T11:37:14.719-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=136;_ThreadName=Thread-1;|TestWMWork[9999].start running|#]

[#|2011-01-25T11:37:15.730-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=136;_ThreadName=Thread-1;|TestWMWork[9999].stop running|#]

[#|2011-01-25T11:37:15.732-0800|INFO|glassfish3.1|javax.enterprise.system.std.com.sun.enterprise.server.logging|_ThreadID=134;_ThreadName=Thread-1;|TestWMWork[1000].stop running|#]

glassfishrobot commented 13 years ago

@glassfishrobot Commented @honghzzhang said: Yes, this stack trace was in the server.log you attached earlier too. We had some discussions on this and it seems due to transaction timeout was expired, but not sure why the transaction was expired yet.

I had spent some time trying to further narrow down the steps and the following seems the simplest set of steps to reproduce the problem:

asadmin start-domain asadmin create-cluster c1 asadmin create-local-instance --cluster c1 --systemproperties HTTP_LISTENER_PORT=18080:HTTP_SSL_LISTENER_PORT=18181:IIOP_SSL_LISTENER_PORT=13800:IIOP_LISTENER_PORT=13700:JMX_SYSTEM_CONNECTOR_PORT=17676:IIOP_SSL_MUTUALAUTH_PORT=13801:JMS_PROVIDER_PORT=18686:ASADMIN_LISTENER_PORT=14848 in1 asadmin create-local-instance --cluster c1 --systemproperties HTTP_LISTENER_PORT=28080:HTTP_SSL_LISTENER_PORT=28181:IIOP_SSL_LISTENER_PORT=23800:IIOP_LISTENER_PORT=23700:JMX_SYSTEM_CONNECTOR_PORT=27676:IIOP_SSL_MUTUALAUTH_PORT=23801:JMS_PROVIDER_PORT=28686:ASADMIN_LISTENER_PORT=24848 in2 asadmin create-resource-ref --target c1 jdbc/__default asadmin start-local-instance --node localhost-domain1 in1 asadmin start-local-instance --node localhost-domain1 in2 asadmin deploy --target c1 generic-ra.rar

A few observations: 1. If we remove the create-resource-ref step, the deployment went through successfully. 2. It does not matter if the create-resource-ref happens before or after the cluster instances are started. 3. I tried to deploy an enterprise application with ejb and servlet and the deployment went through successfully.

So somehow the create-resource-ref step did not work well with the particular rar application, I will assign to Jagadish for further evaluation.

glassfishrobot commented 13 years ago

@glassfishrobot Commented @h2002044 said: The use-case works fine in b32-dec-8-2010 nightly and starts failing from b32-dec-10-2010 nightly.

I could narrow down to the svn revision "43607".

It looks like the bug fix for 14809 is causing (or exposing) this issue. http://java.net/jira/browse/GLASSFISH-14809

Once I reverted the changes for the above fix, the use-case reported in issue 15677 works fine. http://java.net/jira/browse/GLASSFISH-15677

glassfishrobot commented 13 years ago

@glassfishrobot Commented @h2002044 said: 1. How bad is its impact? (Severity)

glassfishrobot commented 13 years ago

@glassfishrobot Commented @h2002044 said: svn log -v -r 44745

Modified Paths:

trunk/v3/connectors/connectors-runtime/src/main/java/com/sun/enterprise/connectors/service/ResourceAdapterAdminServiceImpl.java trunk/v3/connectors/connectors-runtime/src/main/java/com/sun/enterprise/connectors/ConnectorRegistry.java

Fix will be available in (first) RC build.

Tested the following scenario and is working fine without deadlock/timeout issue.

Install GlassFish $AS_HOME/bin/asadmin start-domain $AS_HOME/bin/asadmin create-cluster c1 $AS_HOME/bin/asadmin create-local-instance --cluster c1 --systemproperties HTTP_LISTENER_PORT=18080:HTTP_SSL_LIST ENER_PORT=18181:IIOP_SSL_LISTENER_PORT=13800:IIOP_LISTENER_PORT=13700:JMX_SYSTEM_CONNECTOR_PORT=17676:IIOP_SSL_MU TUALAUTH_PORT=13801:JMS_PROVIDER_PORT=18686:ASADMIN_LISTENER_PORT=14848 in1

$AS_HOME/bin/asadmin create-local-instance --cluster c1 --systemproperties HTTP_LISTENER_PORT=28080:HTTP_SSL_LIS TENER_PORT=28181:IIOP_SSL_LISTENER_PORT=23800:IIOP_LISTENER_PORT=23700:JMX_SYSTEM_CONNECTOR_PORT=27676:IIOP_SSL_M UTUALAUTH_PORT=23801:JMS_PROVIDER_PORT=28686:ASADMIN_LISTENER_PORT=24848 in2 $AS_HOME/bin/asadmin start-local-instance in1 $AS_HOME/bin/asadmin start-local-instance in2 $AS_HOME/bin/asadmin deploy --target c1 generic-ra.rar $AS_HOME/bin/asadmin stop-cluster c1 $AS_HOME/bin/asadmin start-cluster c1

NOTE : This test resource-adapter tries to start a transactional work in ResourceAdapter.start() which is not a recommended practise.

glassfishrobot commented 13 years ago

@glassfishrobot Commented File: generic-ra.rar Attached By: easarina

glassfishrobot commented 13 years ago

@glassfishrobot Commented File: server.log Attached By: easarina

glassfishrobot commented 7 years ago

@glassfishrobot Commented This issue was imported from java.net JIRA GLASSFISH-15677

glassfishrobot commented 13 years ago

@glassfishrobot Commented Reported by easarina

glassfishrobot commented 13 years ago

@glassfishrobot Commented Marked as fixed on Thursday, January 27th 2011, 12:02:25 pm