Propagate sender's HM.Entry seqid in sent HealthMessage

eclipse-ee4j / glassfish-shoal

Propagate sender's HM.Entry seqid in sent HealthMessage #81

Closed glassfishrobot closed 14 years ago

glassfishrobot commented 15 years ago

Requires changes in HealthMessage.initialize() and getDocument().

HealthMessage.getDocument() should write HealthMessage.Entry.sequenceId into its XML document representation. HealthMessage.initialize() should read the sender's sequence id for HealthMessage.Entry back from that XML document representation.

Currently, the receiver side just creates a sequence id based on the order in which messages are received. The JXTA messaging protocol does not guarantee that messages are received in the exact order they were sent, so the current sequencing mechanism can cause health messages to be processed out of order. This could result in an incorrectly computed cache state for an instance in the master node.
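The round trip described above can be sketched roughly as follows. This is only an illustration using the JDK DOM API as a stand-in for the XML document representation; the class EntrySketch, the element name "Entry" and the attribute name "seqId" are assumptions made for the example and are not taken from the Shoal sources.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

// Hypothetical stand-in for HealthMessage.Entry; not the actual Shoal class.
final class EntrySketch {
    final String state;
    final long sequenceId;

    EntrySketch(String state, long sequenceId) {
        this.state = state;
        this.sequenceId = sequenceId;
    }

    // Sender side: write the entry, including its sequence id, into the document.
    static Document toDocument(EntrySketch entry) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("HealthMessage");
            Element e = doc.createElement("Entry");        // element name is an assumption
            e.setAttribute("state", entry.state);
            e.setAttribute("seqId", Long.toString(entry.sequenceId)); // assumed attribute name
            root.appendChild(e);
            doc.appendChild(root);
            return doc;
        } catch (ParserConfigurationException ex) {
            throw new IllegalStateException(ex);
        }
    }

    // Receiver side: read the sender's sequence id back instead of generating a local one.
    static EntrySketch fromDocument(Document doc) {
        Element e = (Element) doc.getDocumentElement()
                .getElementsByTagName("Entry").item(0);
        return new EntrySketch(e.getAttribute("state"),
                Long.parseLong(e.getAttribute("seqId")));
    }
}
```

With something along these lines, the receiver would see the same sequence id the sender assigned, regardless of arrival order.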

Environment

Operating System: All
Platform: All

Affected Versions

[current]

glassfishrobot commented 6 years ago

Issue Imported From: https://github.com/javaee/shoal/issues/81
Original Issue Raised By: @glassfishrobot
Original Issue Assigned To: @jfialli
Original Issue Closed By: @glassfishrobot

glassfishrobot commented 15 years ago

@glassfishrobot Commented Reported by @jfialli

glassfishrobot commented 15 years ago

@glassfishrobot Commented @jfialli said: Created an attachment (id=12) server log summarizing out of order message processing

glassfishrobot commented 15 years ago

@glassfishrobot Commented @jfialli said: https://shoal.dev.java.net/nonav/issues/showattachment.cgi/12/unexpectedfailure.log

The attachment above summarizes a failure that occurs due to this defect. The instance sends its messages in the following order: aliveandready, clusterstopping, stopping.

The DAS (master node) receives the messages in the following order: stopping (receiver-side seqid 960), clusterstopping (receiver-side seqid 961), aliveandready (receiver-side seqid 963).

The DAS processes the messages in the following order: clusterstopping (961), stopping (960), aliveandready (963).

Because the aliveandready message is processed last, a stopped instance appears to come back to life as far as the master is concerned. The master then marks it INDOUBT and subsequently verifies it as FAILED. This ordering issue must be corrected to fix the failure.
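To illustrate why the sender's sequence id helps here, the following sketch sorts the messages from this scenario by a sender-assigned id rather than by arrival order. The class SenderOrderSketch and the sender-side ids 1, 2 and 3 are made up for the example; the log only records the receiver-side ids.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Shoal.
final class SenderOrderSketch {

    // Minimal message stand-in carrying the id assigned by the sender.
    static final class Msg {
        final String state;
        final long senderSeqId;

        Msg(String state, long senderSeqId) {
            this.state = state;
            this.senderSeqId = senderSeqId;
        }
    }

    // Returns the message states ordered by the sender's sequence id,
    // independent of the order in which they arrived.
    static List<String> inSenderOrder(List<Msg> arrived) {
        List<Msg> sorted = new ArrayList<>(arrived);
        sorted.sort(Comparator.comparingLong((Msg m) -> m.senderSeqId));
        List<String> states = new ArrayList<>();
        for (Msg m : sorted) {
            states.add(m.state);
        }
        return states;
    }
}
```

Given the arrival order stopping (3), clusterstopping (2), aliveandready (1), this ordering ends with stopping, so the instance would not appear to come back to life.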

glassfishrobot commented 15 years ago

@glassfishrobot Commented @jfialli said: Fix delivered. The sender's sequence id is now propagated.

Also, the member's start time is now used together with the sequence id to order messages across one invocation of a server instance and a restarted invocation. (The node agent can restart a failed instance quickly, so this can happen.)
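One way to picture the combined ordering is a two-level comparison: start time first, then sequence id. The sketch below is an assumption about how such an ordering could look, not the delivered fix; the names RestartOrderSketch, Stamp and memberStartTime are hypothetical, and it assumes the sequence id starts over when an instance restarts.

```java
import java.util.Comparator;

// Hypothetical helper, not part of Shoal.
final class RestartOrderSketch {

    // Ordering key for a health message: when its sender started, and the
    // sender-assigned sequence id within that invocation.
    static final class Stamp {
        final long memberStartTime;
        final long sequenceId;

        Stamp(long memberStartTime, long sequenceId) {
            this.memberStartTime = memberStartTime;
            this.sequenceId = sequenceId;
        }
    }

    // Messages from an earlier invocation sort before messages from a later one;
    // within one invocation, the sequence id decides.
    static final Comparator<Stamp> ORDER =
            Comparator.comparingLong((Stamp s) -> s.memberStartTime)
                      .thenComparingLong(s -> s.sequenceId);
}
```

Under these assumptions, a message with a low sequence id from a restarted instance still sorts after a message with a high sequence id from the previous invocation.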

glassfishrobot commented 15 years ago

@glassfishrobot Commented File: unexpectedfailure.log Attached By: @jfialli

glassfishrobot commented 7 years ago

@glassfishrobot Commented This issue was imported from java.net JIRA SHOAL-81

glassfishrobot commented 14 years ago

@glassfishrobot Commented Marked as fixed on Wednesday, June 23rd 2010, 4:11:06 am