openlvc / portico

Portico is an open source, cross-platform, fully supported HLA RTI implementation. Designed with modularity and flexibility in mind, Portico is a production-grade RTI for the Simulation and Training Community, so come say hi!
http://www.porticoproject.org

Notify Federation when unexpected Federate crash #162

Closed icemagno closed 8 years ago

icemagno commented 8 years ago

Hi, I don't know if I made a mistake. I need to be notified (in an Ambassador event) when a Federate crashes.

My test:

One Federate is running with a class that extends NullFederateAmbassador and overrides all methods (with some out.println calls just for debugging).

The RID file is in use (TRACE log level).

Another Federate (a different class) is running and updating some attributes every 5 seconds.

I can see the logger output in the first Federate every time an attribute changes in the second Federate.

Now I stop the second Federate by closing the program (without letting the Federate resign) to simulate a machine crash or something like that.

The first Federate's logger simply stops updating, without telling me what happened to the second one.

Am I doing something wrong, or is this the correct behaviour?

timpokorny commented 8 years ago

G'day mate,

What method are you expecting a callback on?

The MOM can typically handle this. A Manager.Federate object is created for each federate and if a federate goes away (gracefully or otherwise) that object is deleted.
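To make the idea concrete, here is a minimal sketch of the bookkeeping a federate could do on top of those MOM objects. It assumes the federate has subscribed to the MOM federate class and forwards its `discoverObjectInstance()` and `removeObjectInstance()` callbacks to the helper below. `FederateTracker`, its method names and the plain `int` handles are hypothetical, chosen so the sketch runs without the RTI on the classpath; they are not part of Portico.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: tracks federates through the lifecycle of their MOM objects. */
class FederateTracker {
    // MOM object handle -> object name as reported on discovery
    private final Map<Integer, String> present = new HashMap<>();
    // Names of MOM objects that have been removed, in order of removal
    private final List<String> departed = new ArrayList<>();

    /** Call from discoverObjectInstance() for MOM federate objects. */
    void onDiscover(int objectHandle, String objectName) {
        present.put(objectHandle, objectName);
    }

    /**
     * Call from removeObjectInstance().
     * Returns the name of the object that went away, or null if it was never tracked.
     */
    String onRemove(int objectHandle) {
        String name = present.remove(objectHandle);
        if (name != null) {
            departed.add(name);
        }
        return name;
    }

    boolean isPresent(int objectHandle) {
        return present.containsKey(objectHandle);
    }

    List<String> departed() {
        return departed;
    }
}
```

A non-null return from `onRemove()` is the point where the application would react to a federate leaving, whether it resigned or crashed.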

Unfortunately the MOM for that section is only properly implemented for HLA 1.3 at the moment. Not in 1516e :(

There was a reason for this being non-trivial. I can have a look on the weekend at how to bring it in, or at another way to support the use case.

icemagno commented 8 years ago

I think I get your point. After some research I found this document:

http://aegistg.com/Technical_Papers/HLA%20to%20the%20WebSept00.pdf

Page 4:

fed_manager

An older, more detailed one:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.39.1509&rep=rep1&type=pdf

Looking into HLAStandardMIM.xml, I found HLAmanager.HLAfederate instead of Manager.Federate.

INFO [localhost-startStop-1] portico.lrc: MOM support is currently unsupported in IEEE-1516e federations.

:-(

Thanks!

timpokorny commented 8 years ago

Yes - there was some reason this got pushed off. I'll take a sit down on the w/e and have a look to see if it can get in quickly for this base stuff and report back. Cheers, Tim

icemagno commented 8 years ago

Fantastic! You're always giving the best support ! But take it easy. Don't spend your w/e solving problems. Go surf! Many thanks pal.

timpokorny commented 8 years ago

G'day mate, still working through this but making some progress (haven't forgotten you).

icemagno commented 8 years ago

No problem pal. Take your time.

timpokorny commented 8 years ago

G'day mate - STILL hacking away at this. I recall all the grand reasons why this wasn't simple now. It is a rabbit hole. Most of the work is being tracked over on ticket 55 (link above) if you want to see.

icemagno commented 8 years ago

Be cool. I have time. I'll focus on other problems (regions and DDM) until you solve this. I appreciate your efforts and your concern to keep me updated.

icemagno commented 8 years ago

Hi Tim, I'd like to invite you to send criticism, suggestions and opinions on my new blog about HLA (more precisely, about the Portico RTI). It would be an honor if you corrected me (I'm not an HLA guru) and my English (not very good), and a bigger honor if you wanted to post there. My posts are snippet-like, step-by-step tutorials, because it was hard for me to find simple tutorials on the web. I know you're giga-busy and will understand if you don't want to.

http://sim.cmabreu.com.br/

Thanks.

timpokorny commented 8 years ago

@icemagno - reference #55 for details. Good to go I hope :S

icemagno commented 8 years ago

Well, here it is:

This is the startup log of Federate 01 (Sagitarii) in a Federation also called Sagitarii. The FOM is the Standard MIM and the SOM is the same in both Federates.

DEBUG [localhost-startStop-1] portico.lrc: Creating new LRC
DEBUG [localhost-startStop-1] portico.lrc: Portico version: 2.1.0 (build 3)
DEBUG [localhost-startStop-1] portico.lrc: Interface: IEEE1516e
TRACE [localhost-startStop-1] portico.lrc: Provided connection implementation is "org.portico.bindings.jgroups.JGroupsConnection"
TRACE [localhost-startStop-1] portico.lrc: Trying to load connection class: org.portico.bindings.jgroups.JGroupsConnection
TRACE [localhost-startStop-1] portico.lrc: ATTEMPT create IConnection, class= class org.portico.bindings.jgroups.JGroupsConnection
TRACE [localhost-startStop-1] portico.lrc: SUCCESS created IConnection, class= class org.portico.bindings.jgroups.JGroupsConnection
TRACE [localhost-startStop-1] portico.lrc: Applying modules using component keyword: lrc1516e
TRACE [localhost-startStop-1] portico.lrc: STARTING Apply module [lrc13-callback] to LRC
TRACE [localhost-startStop-1] portico.lrc: Applied [0/24] handlers
TRACE [localhost-startStop-1] portico.lrc: STARTING Apply module [lrc1516-callback] to LRC
TRACE [localhost-startStop-1] portico.lrc: Applied [0/11] handlers
TRACE [localhost-startStop-1] portico.lrc: STARTING Apply module [lrc1516e-callback] to LRC
TRACE [localhost-startStop-1] portico.lrc: Applied [24/24] handlers
TRACE [localhost-startStop-1] portico.lrc: STARTING Apply module [lrc-base] to LRC
TRACE [localhost-startStop-1] portico.lrc: Applied [82/92] handlers
DEBUG [localhost-startStop-1] portico.lrc: Messaging framework configuration complete
INFO  [localhost-startStop-1] portico.lrc: LRC initialized (HLA version: IEEE1516e)
INFO  [localhost-startStop-1] portico.lrc: Opening LRC Connection
INFO  [localhost-startStop-1] portico.lrc.jgroups: jgroups connection is up and running
DEBUG [ImmediateCallbackDispatcher] portico.lrc: Starting immediate callback delivery processor
DEBUG [localhost-startStop-1] portico.lrc.fom: Parsing FED file (format=ieee1516e): file:/C:/Users/Magno/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp2/wtpwebapps/sagitarii2/foms/HLAstandardMIM.xml
DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Create federation execution [Sagitarii]
TRACE [localhost-startStop-1] portico.lrc.jgroups: ATTEMPT Connecting to channel [Sagitarii]

-------------------------------------------------------------------
GMS: address=AKRAB-19517, cluster=Sagitarii, physical address=192.168.25.7:61388
-------------------------------------------------------------------
DEBUG [localhost-startStop-1] portico.lrc.jgroups: SUCCESS Connected to channel [Sagitarii]
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=0, source=AKRAB-19517
DEBUG [Regular] portico.lrc.jgroups: (GMS) findCoordinator(AKRAB-19517)
INFO  [localhost-startStop-1] portico.lrc.jgroups: No co-ordinator found - appointing myself!
DEBUG [localhost-startStop-1] portico.lrc.jgroups: REQUEST createFederation: name=Sagitarii
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=19416, source=AKRAB-19517
DEBUG [Regular] portico.lrc.jgroups: (GMS) createFederation(AKRAB-19517)
DEBUG [Regular] portico.lrc.jgroups: Received federation creation notification: federation=Sagitarii, fomSize=19416b, source=6376a034-829a-4efa-8e8e-a39850943e16
INFO  [Regular] portico.lrc.jgroups: Federation [Sagitarii] has been created
INFO  [localhost-startStop-1] portico.lrc.jgroups: SUCCESS createFederation: name=Sagitarii
INFO  [localhost-startStop-1] portico.lrc: SUCCESS Created federation execution [Sagitarii]
DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Join federate [Sagitarii] to federation [Sagitarii]
DEBUG [localhost-startStop-1] portico.lrc.fom: Parsing FED file (format=ieee1516e): file:/C:/Users/Magno/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp2/wtpwebapps/sagitarii2/foms/SagitariiFederation.xml
DEBUG [localhost-startStop-1] portico.lrc: Parsed [1] additional FOM modules
DEBUG [localhost-startStop-1] portico.lrc.jgroups: Validate that [1] modules can merge successfully with the existing FOM
TRACE [localhost-startStop-1] portico.lrc.merger: Beginning merge of 2 FOM models
TRACE [localhost-startStop-1] portico.lrc.merger: Merging [file:/C:/Users/Magno/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp2/wtpwebapps/sagitarii2/foms/SagitariiFederation.xml] into combined FOM
TRACE [localhost-startStop-1] portico.lrc.merger:    -> Inserting class [HLAobjectRoot.SagitariiServer]
TRACE [localhost-startStop-1] portico.lrc.merger:    -> Inserting class [HLAobjectRoot.Core]
TRACE [localhost-startStop-1] portico.lrc.merger:    -> Inserting class [HLAobjectRoot.Teapot]
TRACE [localhost-startStop-1] portico.lrc.merger:    -> Inserting class [HLAinteractionRoot.RunInstance]
TRACE [localhost-startStop-1] portico.lrc.merger:    -> Inserting class [HLAinteractionRoot.RequestTask]
DEBUG [localhost-startStop-1] portico.lrc.jgroups: Modules can be merged successfully, continue with join
DEBUG [localhost-startStop-1] portico.lrc.jgroups: REQUEST joinFederation: federate=Sagitarii, federation=Sagitarii
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=9, source=AKRAB-19517
DEBUG [Regular] portico.lrc.jgroups: (GMS) joinFederation(AKRAB-19517)
DEBUG [Regular] portico.lrc.jgroups: Received federate join notification: federate=Sagitarii, federation=Sagitarii, source=6376a034-829a-4efa-8e8e-a39850943e16
INFO  [Regular] portico.lrc.jgroups: Federate [Sagitarii] joined federation [Sagitarii]
INFO  [localhost-startStop-1] portico.lrc.jgroups: SUCCESS Joined federation with name=Sagitarii
DEBUG [localhost-startStop-1] portico.lrc.jgroups: Merging 1 additional FOM modules that we receive with join request
TRACE [localhost-startStop-1] portico.lrc.merger: Beginning merge of 2 FOM models
TRACE [localhost-startStop-1] portico.lrc.merger: Merging [file:/C:/Users/Magno/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp2/wtpwebapps/sagitarii2/foms/SagitariiFederation.xml] into combined FOM
TRACE [localhost-startStop-1] portico.lrc.merger:    -> Inserting class [HLAobjectRoot.Core]
TRACE [localhost-startStop-1] portico.lrc.merger:    -> Inserting class [HLAobjectRoot.SagitariiServer]
TRACE [localhost-startStop-1] portico.lrc.merger:    -> Inserting class [HLAobjectRoot.Teapot]
TRACE [localhost-startStop-1] portico.lrc.merger:    -> Inserting class [HLAinteractionRoot.RequestTask]
TRACE [localhost-startStop-1] portico.lrc.merger:    -> Inserting class [HLAinteractionRoot.RunInstance]
TRACE [localhost-startStop-1] portico.lrc: Created Mom.Federation object, added to Repository (undiscovered)
TRACE [localhost-startStop-1] portico.lrc: Created Mom.Federate(Sagitarii), queued discovery notification
TRACE [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Received MOM object discovery for federate [MOM.Federate(Sagitarii)]
DEBUG [ImmediateCallbackDispatcher] portico.lrc: DISCARD Discovery of object (not subscribed): object=1
TRACE [localhost-startStop-1] portico.lrc: joined federation, waiting for RoleCalls from [1]
INFO  [localhost-startStop-1] portico.lrc: SUCCESS Joined federate [Sagitarii] to federation [Sagitarii]: handle=1
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=3949, source=AKRAB-19517
DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Publish [744] with attributes [745]
TRACE [localhost-startStop-1] portico.lrc: NOTICE  Implicitly adding privToDelete
INFO  [localhost-startStop-1] portico.lrc: SUCCESS Published [744] with attributes [502, 745]
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=402, source=AKRAB-19517
DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Register instance of class [744] , name=Sagitarii Server
INFO  [localhost-startStop-1] portico.lrc: SUCCESS Register instance of class [744], object handle=2
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=290, source=AKRAB-19517
DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Update object [2], attributes [745] (RO)
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=108, source=AKRAB-19517
INFO  [localhost-startStop-1] portico.lrc: SUCCESS Updated object [2], attributes [745] (RO)
DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Subscribe to [746] with attributes [752, 753, 747, 748, 749, 750, 751]
INFO  [localhost-startStop-1] portico.lrc: SUCCESS Subscribeed to [746] with attributes [752, 753, 747, 748, 749, 750, 751]
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=483, source=AKRAB-19517
DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Subscribe to [736] with attributes [737, 738, 739, 740, 741, 742, 743]
INFO  [localhost-startStop-1] portico.lrc: SUCCESS Subscribeed to [736] with attributes [737, 738, 739, 740, 741, 742, 743]
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=483, source=AKRAB-19517
DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Subscribe to [2] with attributes [5, 7, 14]
TRACE [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Received MOM object discovery for federate [MOM.Federate(Sagitarii)]
INFO  [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [1] registeredAs=2, discoveredAs=2
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=1,class=2,name=MOM.Federate(Sagitarii))
DEBUG [localhost-startStop-1] portico.lrc: Queued Discover callback for instance [1] after subscription to class [2]
TRACE [ImmediateCallbackDispatcher] portico.lrc:          discoverObjectInstance() callback complete
INFO  [localhost-startStop-1] portico.lrc: SUCCESS Subscribeed to [2] with attributes [5, 7, 14]
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=443, source=AKRAB-19517

Now, let's start Federate 02 (Teapot Node) on the same machine (AKRAB). This is the log output appended to the above (all log output shown here comes from Federate 01).

TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=0, source=AKRAB-51112
DEBUG [Regular] portico.lrc.jgroups: (GMS) findCoordinator(AKRAB-51112)
DEBUG [Regular] portico.lrc.jgroups: Received request for manifest from 4e06bacb-2ee7-4ec6-901b-92fe3e48acaa
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=21442, source=AKRAB-19517
DEBUG [Regular] portico.lrc.jgroups: (GMS) setManifest(AKRAB-19517)
DEBUG [Regular] portico.lrc.jgroups: Sent manifest (21442b) to 4e06bacb-2ee7-4ec6-901b-92fe3e48acaa
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=29, source=AKRAB-51112
DEBUG [Regular] portico.lrc.jgroups: (GMS) joinFederation(AKRAB-51112)
DEBUG [Regular] portico.lrc.jgroups: Received federate join notification: federate=Teapot Node 00-13-46-94-18-C1, federation=Sagitarii, source=4e06bacb-2ee7-4ec6-901b-92fe3e48acaa
INFO  [Regular] portico.lrc.jgroups: Federate [Teapot Node 00-13-46-94-18-C1] joined federation [Sagitarii]
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=3754, source=AKRAB-51112
DEBUG [Regular] portico.lrc: @REMOTE RoleCall received [handle:2,name:Teapot Node 00-13-46-94-18-C1] by local federate [Sagitarii]
TRACE [Regular] portico.lrc: Created Mom.Federate(Teapot Node 00-13-46-94-18-C1), queued discovery notification
DEBUG [Regular] portico.lrc: Merging 1 additional FOM modules from [Teapot Node 00-13-46-94-18-C1]
TRACE [Regular] portico.lrc.merger: Beginning merge of 2 FOM models
TRACE [Regular] portico.lrc.merger: Merging [file:/F:/sagitarii2/teapot/foms/SagitariiFederation.xml] into combined FOM
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=21866, source=AKRAB-19517
TRACE [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Received MOM object discovery for federate [MOM.Federate(Teapot Node 00-13-46-94-18-C1)]
INFO  [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097151] registeredAs=2, discoveredAs=2
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097151,class=2,name=MOM.Federate(Teapot Node 00-13-46-94-18-C1))
TRACE [ImmediateCallbackDispatcher] portico.lrc:          discoverObjectInstance() callback complete
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=574, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=462, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=309, source=AKRAB-51112
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Federate [Teapot Node 00-13-46-94-18-C1] published object class [746] with attributes [752, 753, 502, 747, 748, 749, 750, 751]
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097153, class=746, owned=[753, 752, 749, 748, 502, 747, 751, 750]
INFO  [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097153] registeredAs=746, discoveredAs=746
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097153,class=746,name=Teapot Node)
TRACE [ImmediateCallbackDispatcher] portico.lrc:          discoverObjectInstance() callback complete
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=218, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=462, source=AKRAB-51112
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Received object UPDATE [2097153] with attributes [752, 753, 747, 748, 749, 750, 751] (RO)
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK reflectAttributeValues(object=2097153,attributes={752(40b),753(16b),747(8b),748(4b),749(8b),750(8b),751(28b)}) (RO)
TeapotObject:setFreeMemory: 8001520
TeapotObject:setCpuLoad: 11.0
TeapotObject:setTotalMemory: 16252928
TRACE [ImmediateCallbackDispatcher] portico.lrc:          reflectAttributeValues() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Federate [Teapot Node 00-13-46-94-18-C1] published object class [736] with attributes [737, 738, 739, 740, 741, 742, 502, 743]
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097154, class=736, owned=[741, 739, 740, 742, 743, 737, 502, 738]
INFO  [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097154] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097154,class=736,name=HLA2097154)
TRACE [ImmediateCallbackDispatcher] portico.lrc:          discoverObjectInstance() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097155, class=736, owned=[740, 742, 743, 738, 741, 502, 737, 739]
INFO  [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097155] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097155,class=736,name=HLA2097155)
TRACE [ImmediateCallbackDispatcher] portico.lrc:          discoverObjectInstance() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097156, class=736, owned=[742, 739, 741, 743, 738, 740, 502, 737]
INFO  [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097156] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097156,class=736,name=HLA2097156)
TRACE [ImmediateCallbackDispatcher] portico.lrc:          discoverObjectInstance() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097157, class=736, owned=[740, 738, 742, 502, 743, 739, 741, 737]
INFO  [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097157] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097157,class=736,name=HLA2097157)
TRACE [ImmediateCallbackDispatcher] portico.lrc:          discoverObjectInstance() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097158, class=736, owned=[737, 740, 738, 742, 741, 739, 743, 502]
INFO  [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097158] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097158,class=736,name=HLA2097158)
TRACE [ImmediateCallbackDispatcher] portico.lrc:          discoverObjectInstance() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097159, class=736, owned=[739, 743, 741, 502, 742, 738, 737, 740]
INFO  [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097159] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097159,class=736,name=HLA2097159)
TRACE [ImmediateCallbackDispatcher] portico.lrc:          discoverObjectInstance() callback complete

Everything looks normal. The Federate 01 log is telling me Federate 02 is in the game.
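In the log above, the MOM objects are discovered with names of the form `MOM.Federate(<federate name>)`. A small sketch of how a federate could recover the federate name inside its discovery callback follows. The `MomNames` class is hypothetical, and the name format is only what this Portico 2.1.0 log shows, so treat it as an assumption rather than a documented contract.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper: extracts the federate name from a MOM object name seen in the log. */
class MomNames {
    // Matches names such as "MOM.Federate(Teapot Node 00-13-46-94-18-C1)"
    private static final Pattern MOM_FEDERATE =
            Pattern.compile("^MOM\\.Federate\\((.*)\\)$");

    /** Returns the federate name, or empty if the object is not a MOM federate object. */
    static Optional<String> federateName(String objectName) {
        Matcher m = MOM_FEDERATE.matcher(objectName);
        return m.matches() ? Optional.of(m.group(1)) : Optional.empty();
    }
}
```

Ordinary objects such as `Teapot Node` or `HLA2097154` do not match, so the same callback can tell MOM federate objects apart from application objects.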

Now, let's simulate a crash of Federate 02 (Teapot Node) by simply closing its running terminal. A big bada-boom!

Terminal closed. I don't give Federate 02 time to resign or even to know what happened.

At this point, I expected to see some log output in Federate 01 telling me of the epic death of Federate 02, but not even a line of log was printed. None. Federate 01 (and the RTI) still think Federate 02 is alive. I can prove this by trying to start Federate 02 again: a name conflict error is raised on the Federate 02 screen.

I see a problem here because the Federates don't have a place to stay connected to (like a Central RTI Component). AFAIK I will need to poll every Federate (maybe with an Interaction) to discover who is still alive, but that may inject some noise into the network.
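The polling idea could be sketched like this, as an application-level workaround rather than anything the RTI provides. It assumes each Federate periodically sends a heartbeat (for example through an Interaction, or simply its regular attribute updates) and that the receiver passes the sender's name and a timestamp to the hypothetical `HeartbeatMonitor` below. The class and the timeout value are illustrative only.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper: flags federates that have been silent for too long. */
class HeartbeatMonitor {
    private final long timeoutMillis;
    // Federate name -> time (ms) of the last heartbeat received from it
    private final Map<String, Long> lastSeen = new HashMap<>();

    HeartbeatMonitor(long timeoutMillis) {
        this.timeoutMillis = timeoutMillis;
    }

    /** Record a heartbeat, e.g. from receiveInteraction() or reflectAttributeValues(). */
    void heartbeat(String federate, long nowMillis) {
        lastSeen.put(federate, nowMillis);
    }

    /** Returns the federates silent for longer than the timeout and stops tracking them. */
    List<String> expire(long nowMillis) {
        List<String> dead = new ArrayList<>();
        lastSeen.entrySet().removeIf(e -> {
            boolean expired = nowMillis - e.getValue() > timeoutMillis;
            if (expired) {
                dead.add(e.getKey());
            }
            return expired;
        });
        return dead;
    }
}
```

With updates arriving every 5 seconds, a timeout of around 15 seconds would tolerate two missed updates before declaring a Federate dead. The trade-off is the one mentioned above: shorter heartbeat intervals detect crashes sooner but add more traffic.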

timpokorny commented 8 years ago

Poo. Let me look into it. It might take a little bit to show up as JGroups has to detect that it has gone down, then confirm, and then notify. Let me dig in. Cheers, Tim
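Roughly, the detect / confirm / notify sequence can be pictured as a small state machine. The sketch below is not JGroups code and does not use its API; `FailureDetector` and its states are hypothetical and only illustrate why a crash is reported after a delay rather than immediately.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical two-step failure detector: suspect first, then confirm and notify. */
class FailureDetector {
    enum State { ALIVE, SUSPECTED, DOWN }

    private final Map<String, State> states = new HashMap<>();
    private final Consumer<String> onDown;

    FailureDetector(Consumer<String> onDown) {
        this.onDown = onDown;
    }

    /** Any sign of life resets the member to ALIVE. */
    void seen(String member) {
        states.put(member, State.ALIVE);
    }

    /** A missed check moves ALIVE -> SUSPECTED, then SUSPECTED -> DOWN with a notification. */
    void missed(String member) {
        State s = states.get(member);
        if (s == State.ALIVE) {
            states.put(member, State.SUSPECTED);   // detect
        } else if (s == State.SUSPECTED) {
            states.put(member, State.DOWN);        // confirm
            onDown.accept(member);                 // notify
        }
        // Unknown or already DOWN members are left unchanged.
    }

    State state(String member) {
        return states.get(member);
    }
}
```

Because a single missed check only raises suspicion, the listener fires no earlier than the second consecutive miss, which is the kind of lag described above.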

On 2 Feb 2016, at 7:53 am, Carlos Magno Oliveira de Abreu <notifications@github.com> wrote:

Well, here is :

This is the Federate 01 start log ( Sagitarii ) in a Federation called Sagitarii ( too ): The FOM is the Standard MIM and the SOM is the same in both Federates.

DEBUG [localhost-startStop-1] portico.lrc: Creating new LRC DEBUG [localhost-startStop-1] portico.lrc: Portico version: 2.1.0 (build 3) DEBUG [localhost-startStop-1] portico.lrc: Interface: IEEE1516e TRACE [localhost-startStop-1] portico.lrc: Provided connection implementation is "org.portico.bindings.jgroups.JGroupsConnection" TRACE [localhost-startStop-1] portico.lrc: Trying to load connection class: org.portico.bindings.jgroups.JGroupsConnection TRACE [localhost-startStop-1] portico.lrc: ATTEMPT create IConnection, class= class org.portico.bindings.jgroups.JGroupsConnection TRACE [localhost-startStop-1] portico.lrc: SUCCESS created IConnection, class= class org.portico.bindings.jgroups.JGroupsConnection TRACE [localhost-startStop-1] portico.lrc: Applying modules using component keyword: lrc1516e TRACE [localhost-startStop-1] portico.lrc: STARTING Apply module [lrc13-callback] to LRC TRACE [localhost-startStop-1] portico.lrc: Applied [0/24] handlers TRACE [localhost-startStop-1] portico.lrc: STARTING Apply module [lrc1516-callback] to LRC TRACE [localhost-startStop-1] portico.lrc: Applied [0/11] handlers TRACE [localhost-startStop-1] portico.lrc: STARTING Apply module [lrc1516e-callback] to LRC TRACE [localhost-startStop-1] portico.lrc: Applied [24/24] handlers TRACE [localhost-startStop-1] portico.lrc: STARTING Apply module [lrc-base] to LRC TRACE [localhost-startStop-1] portico.lrc: Applied [82/92] handlers DEBUG [localhost-startStop-1] portico.lrc: Messaging framework configuration complete INFO [localhost-startStop-1] portico.lrc: LRC initialized (HLA version: IEEE1516e) INFO [localhost-startStop-1] portico.lrc: Opening LRC Connection INFO [localhost-startStop-1] portico.lrc.jgroups: jgroups connection is up and running DEBUG [ImmediateCallbackDispatcher] portico.lrc: Starting immediate callback delivery processor DEBUG [localhost-startStop-1] portico.lrc.fom: Parsing FED file (format=ieee1516e): 
file:/C:/Users/Magno/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp2/wtpwebapps/sagitarii2/foms/HLAstandardMIM.xml DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Create federation execution [Sagitarii] TRACE [localhost-startStop-1] portico.lrc.jgroups: ATTEMPT Connecting to channel [Sagitarii]


GMS: address=AKRAB-19517, cluster=Sagitarii, physical address=192.168.25.7:61388

DEBUG [localhost-startStop-1] portico.lrc.jgroups: SUCCESS Connected to channel [Sagitarii] TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=0, source=AKRAB-19517 DEBUG [Regular] portico.lrc.jgroups: (GMS) findCoordinator(AKRAB-19517) INFO [localhost-startStop-1] portico.lrc.jgroups: No co-ordinator found - appointing myself! DEBUG [localhost-startStop-1] portico.lrc.jgroups: REQUEST createFederation: name=Sagitarii TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=19416, source=AKRAB-19517 DEBUG [Regular] portico.lrc.jgroups: (GMS) createFederation(AKRAB-19517) DEBUG [Regular] portico.lrc.jgroups: Received federation creation notification: federation=Sagitarii, fomSize=19416b, source=6376a034-829a-4efa-8e8e-a39850943e16 INFO [Regular] portico.lrc.jgroups: Federation [Sagitarii] has been created INFO [localhost-startStop-1] portico.lrc.jgroups: SUCCESS createFederation: name=Sagitarii INFO [localhost-startStop-1] portico.lrc: SUCCESS Created federation execution [Sagitarii] DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Join federate [Sagitarii] to federation [Sagitarii] DEBUG [localhost-startStop-1] portico.lrc.fom: Parsing FED file (format=ieee1516e): file:/C:/Users/Magno/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp2/wtpwebapps/sagitarii2/foms/SagitariiFederation.xml DEBUG [localhost-startStop-1] portico.lrc: Parsed [1] additional FOM modules DEBUG [localhost-startStop-1] portico.lrc.jgroups: Validate that [1] modules can merge successfully with the existing FOM TRACE [localhost-startStop-1] portico.lrc.merger: Beginning merge of 2 FOM models TRACE [localhost-startStop-1] portico.lrc.merger: Merging [file:/C:/Users/Magno/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp2/wtpwebapps/sagitarii2/foms/SagitariiFederation.xml] into combined FOM TRACE [localhost-startStop-1] portico.lrc.merger: -> Inserting class [HLAobjectRoot.SagitariiServer] TRACE 
[localhost-startStop-1] portico.lrc.merger: -> Inserting class [HLAobjectRoot.Core] TRACE [localhost-startStop-1] portico.lrc.merger: -> Inserting class [HLAobjectRoot.Teapot] TRACE [localhost-startStop-1] portico.lrc.merger: -> Inserting class [HLAinteractionRoot.RunInstance] TRACE [localhost-startStop-1] portico.lrc.merger: -> Inserting class [HLAinteractionRoot.RequestTask] DEBUG [localhost-startStop-1] portico.lrc.jgroups: Modules can be merged successfully, continue with join DEBUG [localhost-startStop-1] portico.lrc.jgroups: REQUEST joinFederation: federate=Sagitarii, federation=Sagitarii TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=9, source=AKRAB-19517 DEBUG [Regular] portico.lrc.jgroups: (GMS) joinFederation(AKRAB-19517) DEBUG [Regular] portico.lrc.jgroups: Received federate join notification: federate=Sagitarii, federation=Sagitarii, source=6376a034-829a-4efa-8e8e-a39850943e16 INFO [Regular] portico.lrc.jgroups: Federate [Sagitarii] joined federation [Sagitarii] INFO [localhost-startStop-1] portico.lrc.jgroups: SUCCESS Joined federation with name=Sagitarii DEBUG [localhost-startStop-1] portico.lrc.jgroups: Merging 1 additional FOM modules that we receive with join request TRACE [localhost-startStop-1] portico.lrc.merger: Beginning merge of 2 FOM models TRACE [localhost-startStop-1] portico.lrc.merger: Merging [file:/C:/Users/Magno/workspace/.metadata/.plugins/org.eclipse.wst.server.core/tmp2/wtpwebapps/sagitarii2/foms/SagitariiFederation.xml] into combined FOM TRACE [localhost-startStop-1] portico.lrc.merger: -> Inserting class [HLAobjectRoot.Core] TRACE [localhost-startStop-1] portico.lrc.merger: -> Inserting class [HLAobjectRoot.SagitariiServer] TRACE [localhost-startStop-1] portico.lrc.merger: -> Inserting class [HLAobjectRoot.Teapot] TRACE [localhost-startStop-1] portico.lrc.merger: -> Inserting class [HLAinteractionRoot.RequestTask] TRACE [localhost-startStop-1] portico.lrc.merger: -> Inserting class 
[HLAinteractionRoot.RunInstance] TRACE [localhost-startStop-1] portico.lrc: Created Mom.Federation object, added to Repository (undiscovered) TRACE [localhost-startStop-1] portico.lrc: Created Mom.Federate(Sagitarii), queued discovery notification TRACE [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Received MOM object discovery for federate [MOM.Federate(Sagitarii)] DEBUG [ImmediateCallbackDispatcher] portico.lrc: DISCARD Discovery of object (not subscribed): object=1 TRACE [localhost-startStop-1] portico.lrc: joined federation, waiting for RoleCalls from [1] INFO [localhost-startStop-1] portico.lrc: SUCCESS Joined federate [Sagitarii] to federation [Sagitarii]: handle=1 TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=3949, source=AKRAB-19517 DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Publish [744] with attributes [745] TRACE [localhost-startStop-1] portico.lrc: NOTICE Implicitly adding privToDelete INFO [localhost-startStop-1] portico.lrc: SUCCESS Published [744] with attributes [502, 745] TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=402, source=AKRAB-19517 DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Register instance of class [744] , name=Sagitarii Server INFO [localhost-startStop-1] portico.lrc: SUCCESS Register instance of class [744], object handle=2 TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=290, source=AKRAB-19517 DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Update object [2], attributes 745 TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=108, source=AKRAB-19517 INFO [localhost-startStop-1] portico.lrc: SUCCESS Updated object [2], attributes 745 DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Subscribe to [746] with attributes [752, 753, 747, 748, 749, 750, 751] INFO [localhost-startStop-1] portico.lrc: SUCCESS Subscribeed to [746] with attributes [752, 753, 747, 748, 749, 
750, 751] TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=483, source=AKRAB-19517 DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Subscribe to [736] with attributes [737, 738, 739, 740, 741, 742, 743] INFO [localhost-startStop-1] portico.lrc: SUCCESS Subscribeed to [736] with attributes [737, 738, 739, 740, 741, 742, 743] TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=483, source=AKRAB-19517 DEBUG [localhost-startStop-1] portico.lrc: ATTEMPT Subscribe to [2] with attributes [5, 7, 14] TRACE [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Received MOM object discovery for federate [MOM.Federate(Sagitarii)] INFO [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [1] registeredAs=2, discoveredAs=2 TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=1,class=2,name=MOM.Federate(Sagitarii)) DEBUG [localhost-startStop-1] portico.lrc: Queued Discover callback for instance [1] after subscription to class [2] TRACE [ImmediateCallbackDispatcher] portico.lrc: discoverObjectInstance() callback complete INFO [localhost-startStop-1] portico.lrc: SUCCESS Subscribeed to [2] with attributes [5, 7, 14] TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=443, source=AKRAB-19517

Now, let's start Federate 02 (Teapot Node) on the same machine (AKRAB). This is the log output appended to the above (all log output printed here comes from Federate 01).

TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=0, source=AKRAB-51112
DEBUG [Regular] portico.lrc.jgroups: (GMS) findCoordinator(AKRAB-51112)
DEBUG [Regular] portico.lrc.jgroups: Received request for manifest from 4e06bacb-2ee7-4ec6-901b-92fe3e48acaa
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=21442, source=AKRAB-19517
DEBUG [Regular] portico.lrc.jgroups: (GMS) setManifest(AKRAB-19517)
DEBUG [Regular] portico.lrc.jgroups: Sent manifest (21442b) to 4e06bacb-2ee7-4ec6-901b-92fe3e48acaa
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=29, source=AKRAB-51112
DEBUG [Regular] portico.lrc.jgroups: (GMS) joinFederation(AKRAB-51112)
DEBUG [Regular] portico.lrc.jgroups: Received federate join notification: federate=Teapot Node 00-13-46-94-18-C1, federation=Sagitarii, source=4e06bacb-2ee7-4ec6-901b-92fe3e48acaa
INFO [Regular] portico.lrc.jgroups: Federate [Teapot Node 00-13-46-94-18-C1] joined federation [Sagitarii]
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=3754, source=AKRAB-51112
DEBUG [Regular] portico.lrc: @REMOTE RoleCall received [handle:2,name:Teapot Node 00-13-46-94-18-C1] by local federate [Sagitarii]
TRACE [Regular] portico.lrc: Created Mom.Federate(Teapot Node 00-13-46-94-18-C1), queued discovery notification
DEBUG [Regular] portico.lrc: Merging 1 additional FOM modules from [Teapot Node 00-13-46-94-18-C1]
TRACE [Regular] portico.lrc.merger: Beginning merge of 2 FOM models
TRACE [Regular] portico.lrc.merger: Merging [file:/F:/sagitarii2/teapot/foms/SagitariiFederation.xml] into combined FOM
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=21866, source=AKRAB-19517
TRACE [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Received MOM object discovery for federate [MOM.Federate(Teapot Node 00-13-46-94-18-C1)]
INFO [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097151] registeredAs=2, discoveredAs=2
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097151,class=2,name=MOM.Federate(Teapot Node 00-13-46-94-18-C1))
TRACE [ImmediateCallbackDispatcher] portico.lrc: discoverObjectInstance() callback complete
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=574, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=462, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=309, source=AKRAB-51112
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Federate [Teapot Node 00-13-46-94-18-C1] published object class [746] with attributes [752, 753, 502, 747, 748, 749, 750, 751]
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097153, class=746, owned=[753, 752, 749, 748, 502, 747, 751, 750]
INFO [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097153] registeredAs=746, discoveredAs=746
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097153,class=746,name=Teapot Node)
TRACE [ImmediateCallbackDispatcher] portico.lrc: discoverObjectInstance() callback complete
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=218, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=462, source=AKRAB-51112
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Received object UPDATE [2097153] with attributes 752, 753, 747, 748, 749, 750, 751
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [Regular] portico.lrc.jgroups: (incoming) asynchronous, channel=Sagitarii, size=308, source=AKRAB-51112
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK reflectAttributeValues(object=2097153,attributes={752(40b),753(16b),747(8b),748(4b),749(8b),750(8b),751(28b)}) (RO)
TeapotObject:setFreeMemory: 8001520
TeapotObject:setCpuLoad: 11.0
TeapotObject:setTotalMemory: 16252928
TRACE [ImmediateCallbackDispatcher] portico.lrc: reflectAttributeValues() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Federate [Teapot Node 00-13-46-94-18-C1] published object class [736] with attributes [737, 738, 739, 740, 741, 742, 502, 743]
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097154, class=736, owned=[741, 739, 740, 742, 743, 737, 502, 738]
INFO [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097154] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097154,class=736,name=HLA2097154)
TRACE [ImmediateCallbackDispatcher] portico.lrc: discoverObjectInstance() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097155, class=736, owned=[740, 742, 743, 738, 741, 502, 737, 739]
INFO [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097155] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097155,class=736,name=HLA2097155)
TRACE [ImmediateCallbackDispatcher] portico.lrc: discoverObjectInstance() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097156, class=736, owned=[742, 739, 741, 743, 738, 740, 502, 737]
INFO [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097156] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097156,class=736,name=HLA2097156)
TRACE [ImmediateCallbackDispatcher] portico.lrc: discoverObjectInstance() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097157, class=736, owned=[740, 738, 742, 502, 743, 739, 741, 737]
INFO [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097157] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097157,class=736,name=HLA2097157)
TRACE [ImmediateCallbackDispatcher] portico.lrc: discoverObjectInstance() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097158, class=736, owned=[737, 740, 738, 742, 741, 739, 743, 502]
INFO [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097158] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097158,class=736,name=HLA2097158)
TRACE [ImmediateCallbackDispatcher] portico.lrc: discoverObjectInstance() callback complete
DEBUG [ImmediateCallbackDispatcher] portico.lrc: @REMOTE Discover object: owner=Teapot Node 00-13-46-94-18-C1, object=2097159, class=736, owned=[739, 743, 741, 502, 742, 738, 737, 740]
INFO [ImmediateCallbackDispatcher] portico.lrc: DISCOVER object [2097159] registeredAs=736, discoveredAs=736
TRACE [ImmediateCallbackDispatcher] portico.lrc: CALLBACK discoverObjectInstance(object=2097159,class=736,name=HLA2097159)
TRACE [ImmediateCallbackDispatcher] portico.lrc: discoverObjectInstance() callback complete

Everything looks normal. The Federate 01 log tells me Federate 02 is in the game.

Now, let's simulate a crash of Federate 02 (Teapot Node) by simply closing its running terminal. A big bada-boom!

Terminal closed. I didn't give Federate 02 time to resign or even to know what happened.

At this point, I expected to see some log output in Federate 01 telling me of the epic death of Federate 02... but not even a single line was printed. Federate 01 (and the RTI) still think Federate 02 is alive. I can prove this by trying to start Federate 02 again: a name conflict error is raised on the Federate 02 screen.
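
For readers who want the kind of notification expected here, one way a federate could keep track of its peers is through the MOM federate objects that appear in the log above as `MOM.Federate(<name>)`. The sketch below is RTI-independent and purely illustrative: `FederateRoster`, `onDiscover` and `onRemove` are hypothetical names, object handles are modelled as plain `long` values rather than HLA handle types, and it assumes the RTI really does deliver a remove callback for the MOM object when a federate disappears, which is exactly what does not happen in this report.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical helper: maps MOM federate object handles to federate names. */
class FederateRoster {
    // Name format taken from the log output: MOM.Federate(<federate name>)
    private static final String PREFIX = "MOM.Federate(";
    private final Map<Long, String> byHandle = new HashMap<>();

    /** Call from a discoverObjectInstance-style callback. Returns true for MOM federate objects. */
    boolean onDiscover(long objectHandle, String objectName) {
        if (objectName == null || !objectName.startsWith(PREFIX) || !objectName.endsWith(")")) {
            return false; // an ordinary object, not a federate marker
        }
        String federate = objectName.substring(PREFIX.length(), objectName.length() - 1);
        byHandle.put(objectHandle, federate);
        return true;
    }

    /** Call from a removeObjectInstance-style callback. Returns the federate that went away, if any. */
    Optional<String> onRemove(long objectHandle) {
        return Optional.ofNullable(byHandle.remove(objectHandle));
    }

    public static void main(String[] args) {
        FederateRoster roster = new FederateRoster();
        roster.onDiscover(1L, "MOM.Federate(Sagitarii)");
        roster.onDiscover(2097151L, "MOM.Federate(Teapot Node 00-13-46-94-18-C1)");
        roster.onDiscover(2097153L, "Teapot Node"); // ignored: not a MOM federate object
        roster.onRemove(2097151L)
              .ifPresent(name -> System.out.println("Federate gone: " + name));
    }
}
```

In a real federate these two methods would be driven from the ambassador's discover and remove callbacks; the roster itself does not care whether the removal came from a graceful resign or a crash.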


kataner83 commented 8 years ago

Hi everyone, I would like to mention that, besides the things already raised here, I believe crashed federates are not properly resigned from the federation they are running in.

Greetings Volker

icemagno commented 8 years ago

Yep! I think this is an important feature/behaviour for preserving the federation's integrity. The existence of a federate cannot be left in doubt.

timpokorny commented 8 years ago

Completely agree - what should happen at the moment is that we detect that the federate is no longer present (we use a couple of things under the hood for this). Then a resign should be synthesized on its behalf for all federates. I believe the failure detection is currently getting screwed up. Looking into it now.
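
The intended flow described here (notice that a member has gone, then synthesize a resign on its behalf) can be sketched roughly as follows. This is not Portico's code: `MembershipTracker`, `ResignListener` and `viewChanged` are hypothetical names, and the group view is modelled as a plain set of federate names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch: turns membership changes into resign notifications. */
class MembershipTracker {
    /** Illustrative callback; 'synthesized' is true when the federate never resigned itself. */
    interface ResignListener {
        void federateResigned(String federateName, boolean synthesized);
    }

    private final Set<String> members = new LinkedHashSet<>();
    private final List<ResignListener> listeners = new ArrayList<>();

    void addListener(ResignListener listener) { listeners.add(listener); }

    void federateJoined(String name) { members.add(name); }

    /** Graceful path: the federate announced that it is leaving. */
    void federateResigned(String name) {
        if (members.remove(name)) fire(name, false);
    }

    /** Called with the latest group view; any known member missing from it is treated as crashed. */
    void viewChanged(Set<String> currentView) {
        List<String> lost = new ArrayList<>();
        for (String member : members) {
            if (!currentView.contains(member)) lost.add(member);
        }
        for (String member : lost) {
            members.remove(member); // frees the name so the federate can rejoin later
            fire(member, true);     // pseudo resign on behalf of the crashed federate
        }
    }

    boolean isMember(String name) { return members.contains(name); }

    private void fire(String name, boolean synthesized) {
        for (ResignListener listener : listeners) listener.federateResigned(name, synthesized);
    }

    public static void main(String[] args) {
        MembershipTracker tracker = new MembershipTracker();
        tracker.addListener((name, synthesized) ->
            System.out.println("resigned: " + name + " (synthesized=" + synthesized + ")"));
        tracker.federateJoined("Sagitarii");
        tracker.federateJoined("Teapot Node");
        // Teapot Node's terminal is closed: it simply vanishes from the next view.
        tracker.viewChanged(Collections.singleton("Sagitarii"));
    }
}
```

Removing the member before firing the notification also releases its name, which is what would avoid the name conflict on restart reported earlier in the thread.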

timpokorny commented 8 years ago

Looking at this over the weekend. I can see the messages that JGroups passes when a federate is suspected of crashing coming through, but for some reason the Group Management Services don't seem to be acting on that to remove the federate, which would then trigger the pseudo resign notification. Looking further.

timpokorny commented 8 years ago

Have identified a fix. Bug is due to changes in group management as a result of adding WAN support. Getting ready to push the fix but a couple of tests have broken so investigating those first.

timpokorny commented 8 years ago

Other problem identified and fixed. Patches pushed to master. Checking main build now.

timpokorny commented 8 years ago

master build passing. Will get @icemagno to confirm, but this should be fixed now.

icemagno commented 8 years ago

Thanks. I'll try it ASAP.

icemagno commented 8 years ago

Perfect! It's working fine. But... can it be faster? I need to wait nearly 5 seconds to be "rude" and 5 seconds more for the resign message to be fired in the ambassador.

timpokorny commented 8 years ago

I'll have a looksie, but it's all a balance. It needs some time to determine whether the absence of the federate is actually a failure or whether it is just a delay (perhaps delayed send due to cpu exhaustion, retrans required due to network loss, etc...). Currently very conservative.
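
The balance described here can be illustrated with a simple two-stage timeout detector: silence first makes a member suspected, and only continued silence confirms the failure. This is a hedged sketch, not the detection logic Portico or JGroups actually uses; `TimeoutFailureDetector` and its parameters are hypothetical, and the 5000 ms values only mirror the roughly 5 + 5 seconds reported above rather than any known setting.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical two-stage timeout detector (suspect first, then confirm). */
class TimeoutFailureDetector {
    enum Status { ALIVE, SUSPECTED, FAILED }

    private final long suspectAfterMillis;  // silence tolerated before suspicion
    private final long confirmAfterMillis;  // extra silence before declaring failure
    private final Map<String, Long> lastHeard = new HashMap<>();

    TimeoutFailureDetector(long suspectAfterMillis, long confirmAfterMillis) {
        this.suspectAfterMillis = suspectAfterMillis;
        this.confirmAfterMillis = confirmAfterMillis;
    }

    /** Record that any message arrived from the member at the given time. */
    void heard(String member, long nowMillis) { lastHeard.put(member, nowMillis); }

    Status status(String member, long nowMillis) {
        Long last = lastHeard.get(member);
        if (last == null) throw new IllegalArgumentException("unknown member: " + member);
        long silence = nowMillis - last;
        if (silence < suspectAfterMillis) return Status.ALIVE;
        if (silence < suspectAfterMillis + confirmAfterMillis) return Status.SUSPECTED;
        return Status.FAILED;
    }

    public static void main(String[] args) {
        TimeoutFailureDetector conservative = new TimeoutFailureDetector(5000, 5000);
        TimeoutFailureDetector aggressive = new TimeoutFailureDetector(1000, 1000);
        conservative.heard("Teapot Node", 0);
        aggressive.heard("Teapot Node", 0);

        // A 3 second stall (CPU exhaustion, retransmission), not a crash:
        System.out.println(conservative.status("Teapot Node", 3000)); // ALIVE
        System.out.println(aggressive.status("Teapot Node", 3000));   // FAILED (false positive)

        // A real crash is only confirmed once both stages have elapsed:
        System.out.println(conservative.status("Teapot Node", 10000)); // FAILED
    }
}
```

Shorter timeouts report a real crash sooner, but they also turn ordinary stalls into synthesized resigns for federates that are still running.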


icemagno commented 8 years ago

Oh. Understood. Nevermind...