eclipse-ee4j / glassfish-shoal


Very intermittent - ABSTRACT_TRANSPORT BRANCH: dropped Shoal message (using Grizzly transport) in distributed system testing #101

Closed glassfishrobot closed 13 years ago

glassfishrobot commented 14 years ago

When running HAMessageBuddyReplicationSimulator (see the shoal workspace developer test script runHAMessageBuddyReplicationSimulator.sh) on a distributed group of 9 instances, roughly 1 out of 5 runs of the entire test shows the following message drop.

The test confirms a dropped message whenever the 2 exceptions below appear in the server logs.

Test output detecting the dropped message:

Never received objectId:45 msgId:248, from:106

106: FAILED. Confirmed (1) messages were dropped
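For readers unfamiliar with this kind of check, here is a minimal sketch of how a test could arrive at output like the two lines above. It is not the simulator's actual code: the class DropDetector and its methods are hypothetical, and the label printed after "from:" is passed through as-is, without assuming whether it names the sender or the reporting instance.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical bookkeeping for expected vs. received messages (illustration only). */
final class DropDetector {
    // msgId -> objectId for every message that is still outstanding.
    private final Map<Integer, Integer> outstanding = new TreeMap<>();
    private final String label;

    DropDetector(String label) {
        this.label = label;
    }

    /** Record a message that should arrive at some point during the run. */
    void expect(int msgId, int objectId) {
        outstanding.put(msgId, objectId);
    }

    /** Mark a message as delivered. */
    void received(int msgId) {
        outstanding.remove(msgId);
    }

    /** Lines in the style of the test output; empty when nothing was dropped. */
    List<String> report() {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Integer, Integer> e : outstanding.entrySet()) {
            lines.add("Never received objectId:" + e.getValue()
                    + " msgId:" + e.getKey() + ", from:" + label);
        }
        if (!lines.isEmpty()) {
            int dropped = lines.size();
            lines.add(label + ": FAILED. Confirmed (" + dropped + ") messages were dropped");
        }
        return lines;
    }

    public static void main(String[] args) {
        DropDetector detector = new DropDetector("106");
        detector.expect(247, 44);
        detector.expect(248, 45);
        detector.received(247); // 248 never arrives
        detector.report().forEach(System.out::println);
    }
}
```

The point of the sketch is only that a drop is confirmed at the end of the run by comparing what was expected with what was delivered, which is why the sending side can look clean while the receiver reports a failure.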

Here is the matching exception.

[#|2010-03-18T11:18:41.831-0700|WARNING|Shoal|ShoalLogger|_ThreadID=26;_ThreadName=-WorkerThread(31);ClassName=NetworkUtility;MethodName=deserialize;|NetworkUtility.deserialized current objects: messages={NAD=com.sun.enterprise.ee.cms.impl.base.SystemAdvertisementImpl@e8f7fdef, targetPeerId=192.168.46.109:9130:2299:cluster1:n1c1m9, sourcePeerId=192.168.46.108:9130:2299:cluster1:n1c1m8} failed while deserializing name=APPMESSAGE
java.io.StreamCorruptedException: invalid type code: 58
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1356)
    at java.io.ObjectInputStream.defaultReadFields(ObjectInputStream.java:1947)
    at java.io.ObjectInputStream.readSerialData(ObjectInputStream.java:1871)
    at java.io.ObjectInputStream.readOrdinaryObject(ObjectInputStream.java:1753)
    at java.io.ObjectInputStream.readObject0(ObjectInputStream.java:1329)
    at java.io.ObjectInputStream.readObject(ObjectInputStream.java:351)
    at com.sun.enterprise.mgmt.transport.NetworkUtility.deserialize(NetworkUtility.java:419)
    at com.sun.enterprise.mgmt.transport.MessageImpl.readMessagesFromBytes(MessageImpl.java:233)
    at com.sun.enterprise.mgmt.transport.MessageImpl.parseMessage(MessageImpl.java:214)
    at com.sun.enterprise.mgmt.transport.grizzly.GrizzlyMessageProtocolParser.hasNextMessage(GrizzlyMessageProtocolParser.java:140)
    at com.sun.grizzly.filter.ParserProtocolFilter.execute(ParserProtocolFilter.java:139)
    at com.sun.grizzly.DefaultProtocolChain.executeProtocolFilter(DefaultProtocolChain.java:135)
    at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:102)
    at com.sun.grizzly.DefaultProtocolChain.execute(DefaultProtocolChain.java:88)
    at com.sun.grizzly.ProtocolChainContextTask.doCall(ProtocolChainContextTask.java:53)
    at com.sun.grizzly.SelectionKeyContextTask.call(SelectionKeyContextTask.java:57)
    at com.sun.grizzly.NIOContext.execute(NIOContext.java:510)
    at com.sun.grizzly.SelectorHandlerRunner.handleSelectedKey(SelectorHandlerRunner.java:357)
    at com.sun.grizzly.SelectorHandlerRunner.handleSelectedKeys(SelectorHandlerRunner.java:257)
    at com.sun.grizzly.SelectorHandlerRunner.doSelect(SelectorHandlerRunner.java:194)
    at com.sun.grizzly.SelectorHandlerRunner.run(SelectorHandlerRunner.java:129)
    at com.sun.grizzly.util.FixedThreadPool$BasicWorker.dowork(FixedThreadPool.java:379)
    at com.sun.grizzly.util.FixedThreadPool$BasicWorker.run(FixedThreadPool.java:360)
    at java.lang.Thread.run(Thread.java:619)
|#]

Mar 18, 2010 11:18:41 AM com.sun.enterprise.mgmt.transport.grizzly.GrizzlyMessageProtocolParser hasNextMessage
WARNING: hasNextMessage() Thread:-WorkerThread(31),position:6744,nextMsgStartPos:0,expectingMoreData:true,hasMoreBytesToParse:false,error:false,msg size:5405,message: MessageImpl[v1:CLUSTER_MANAGER_MESSAGE:NAD, Target: 192.168.46.109:9130:2299:cluster1:n1c1m9 , Source: 192.168.46.108:9130:2299:cluster1:n1c1m8, com.sun.enterprise.mgmt.transport.MessageIOException: failed to deserialize a message : name = APPMESSAGE
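The parser state logged above (position, nextMsgStartPos, expectingMoreData, msg size) indicates that the parser tracks whether a whole message has arrived before handing bytes to deserialization. As a generic illustration of that idea only, here is a sketch of a length-prefixed framing check. It is not Shoal's wire format or the GrizzlyMessageProtocolParser logic, and it makes no claim about the root cause of this issue: the 4-byte length header and the names FrameCheck and tryReadFrame are assumptions made for the example.

```java
import java.nio.ByteBuffer;

/** Hypothetical length-prefixed framing check (illustration only, not Shoal's format). */
final class FrameCheck {
    // Assumed header: a 4-byte big-endian payload length.
    static final int HEADER_BYTES = 4;

    /**
     * Returns the payload if a complete frame starts at the buffer's current
     * position. Returns null and leaves the position unchanged when more
     * data is still expected.
     */
    static byte[] tryReadFrame(ByteBuffer buf) {
        if (buf.remaining() < HEADER_BYTES) {
            return null; // header not fully received yet
        }
        buf.mark();
        int length = buf.getInt();
        if (length < 0 || buf.remaining() < length) {
            buf.reset(); // incomplete or invalid frame: do not consume anything
            return null;
        }
        byte[] payload = new byte[length];
        buf.get(payload);
        return payload;
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putInt(3).put((byte) 1).put((byte) 2); // declares 3 payload bytes, only 2 present
        buf.flip();
        System.out.println(tryReadFrame(buf) == null ? "expecting more data" : "frame ready");
    }
}
```

In a sketch like this, deserialization only ever sees a payload whose declared length has fully arrived; feeding a partial or misaligned payload to ObjectInputStream is one general way to get a StreamCorruptedException.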

I have not seen this issue when running the checked-in runHAMessageBuddyReplicationSimulator.sh on a single machine with 10 instances in the cluster. I will double-check this by running it several times.

I am also verifying that there are no message drops when running Shoal over the JXTA transport.

Environment

Operating System: All
Platform: All

Affected Versions

[current]

glassfishrobot commented 6 years ago
glassfishrobot commented 14 years ago

@glassfishrobot Commented Reported by @jfialli

glassfishrobot commented 14 years ago

@glassfishrobot Commented @jfialli said: accepting issue.

glassfishrobot commented 14 years ago

@glassfishrobot Commented @jfialli said: Created an attachment (id=22) shoal logs running test runHAMessageBuddyReplicationSimulator with exception in grizzly transport layer receiving the message

glassfishrobot commented 14 years ago

@glassfishrobot Commented @jfialli said: Created an attachment (id=23) server log of instance sending message that was lost on instance106 - nothing in log that is helpful. Just added for completeness that issue is only showing on receiving side, no send error noted.

glassfishrobot commented 13 years ago

@glassfishrobot Commented @jfialli said: The NPE is fixed in grizzly.

glassfishrobot commented 14 years ago

@glassfishrobot Commented File: shoal_bug_101_instance105.log Attached By: @jfialli

glassfishrobot commented 14 years ago

@glassfishrobot Commented File: shoal_bug_101_instance106.log Attached By: @jfialli

glassfishrobot commented 7 years ago

@glassfishrobot Commented This issue was imported from java.net JIRA SHOAL-101

glassfishrobot commented 13 years ago

@glassfishrobot Commented Marked as fixed on Thursday, October 7th 2010, 3:32:32 am