OpenClovis / SAFplus-Availability-Scalability-Platform

Middleware that provides libraries, GUI, and code generator to design multi-node (clustered) applications that are highly available, redundant, and scalable. Provides sub-second node and application fault detection and failover, and useful application libraries including distributed hash tables (checkpoint), event, logging, and communications. Implements SA-Forum APIs where applicable. Used anywhere reliability is a must -- like telecom, wireless, defense and enterprise computing. Download stable release with installer from: ftp.openclovis.com
www.openclovis.com
GNU General Public License v2.0

./etc/init.d/safplus start then ./etc/init.d/safplus stop failed #97

Closed hoangle closed 11 years ago

hoangle commented 11 years ago

Mon Jul 22 15:26:45.318 2013 (scI0I0.2343 : AMF.---.---.01133 : ALERT) Terminating CompName [nameServer_scI0I0] via eoPort [0x5]
Mon Jul 22 15:26:45.319 2013 (scI0I0.2343 : AMF.---.---.01138 : ALERT) Terminating CompName [ckptServer_scI0I0] via eoPort [0xc]
Mon Jul 22 15:26:45.320 2013 (scI0I0.2423 : CKP.MDP.PDN.00790 : NOTICE) Changing the address from [1] to [16777214] for checkpoint [clLogStreamOwnerGlobalCkpt]
Mon Jul 22 15:26:45.321 2013 (scI0I0.2380 : LOG.CLN.---.01102 : NOTICE) Received address change event for checkpoint [clLogStreamOwnerGlobalCkpt] with address [16777214]
Mon Jul 22 15:26:45.322 2013 (scI0I0.2423 : CKP.MDP.PDN.00795 : NOTICE) Changing the address from [1] to [16777214] for checkpoint [clLogMasterCkpt]
Mon Jul 22 15:26:45.322 2013 (scI0I0.2380 : LOG.CLN.---.01110 : NOTICE) Received address change event for checkpoint [clLogMasterCkpt] with address [16777214]
Mon Jul 22 15:26:45.326 2013 (scI0I0.2343 : AMF.EVT.CLEANUP.01156 : NOTICE) Event cleanup done for component [ckptServer_scI0I0], compId [0x1000a], eoId [0x5], eoPort [0xc]
Mon Jul 22 15:26:45.334 2013 (scI0I0.2422 : NAM.SVR.SHU.00149 : NOTICE) Name service has been finalized successfully
Mon Jul 22 15:26:45.335 2013 (scI0I0.2422 : NAM._EO.WRK.00153 : NOTICE) EO [NAM] unblocked and exiting
Mon Jul 22 15:26:45.338 2013 (scI0I0.2343 : AMF.EVT.CLEANUP.01179 : NOTICE) Event cleanup done for component [nameServer_scI0I0], compId [0x10009], eoId [0x4], eoPort [0x5]
Mon Jul 22 15:26:45.338 2013 (scI0I0.2414 : EVT.SRV.CLN.01387 : ERROR) clHandleDestroy failed, rc[0X300005]
Mon Jul 22 15:26:45.339 2013 (scI0I0.2414 : EVT.SRV.CLN.01388 : ERROR) Unbsubscribe All failed, rc[0X130005]
Mon Jul 22 15:26:45.339 2013 (scI0I0.2423 : CKP.EVT.FIN.00824 : ERROR) Event Finalize Failed for EO{port[0xc], evtHandle[0x1000e00000001]} rc[0x130005]
Mon Jul 22 15:26:45.339 2013 (scI0I0.2423 : CKP.EVT.FIN.00825 : ERROR) Event Finalize failed, rc[0X130005]
Mon Jul 22 15:26:45.341 2013 (scI0I0.2423 : CKP._EO.WRK.00834 : NOTICE) EO [CKP] unblocked and exiting
Mon Jul 22 15:26:45.352 2013 (scI0I0.2343 : AMF.---.---.01215 : ALERT) Terminating CompName [eventServer_scI0I0] via eoPort [0x2]
Mon Jul 22 15:26:45.353 2013 (scI0I0.2414 : EVT.HDL.HBD.01445 : WARN) Handle [0x9ebd48:0X2] has not been cleaned, destroying...
Mon Jul 22 15:26:45.353 2013 (scI0I0.2414 : EVT.HDL.HBD.01446 : WARN) Handle [0x9ebd48:0X3] has not been cleaned, destroying...
Mon Jul 22 15:26:45.354 2013 (scI0I0.2414 : EVT.HDL.HBD.01447 : WARN) Handle [0x9ebd48:0X5] has not been cleaned, destroying...
Mon Jul 22 15:26:45.355 2013 (scI0I0.2414 : EVT._EO.WRK.01450 : NOTICE) EO [EVT] unblocked and exiting
Mon Jul 22 15:26:45.355 2013 (scI0I0.2343 : AMF.---.---.01232 : ALERT) Terminating CompName [gmsServer_scI0I0] via eoPort [0x9]
Mon Jul 22 15:26:45.356 2013 (scI0I0.2394 : GMS.GEN.---.00223 : CRITIC) Server Got Termination Request. Started Shutting Down...
Mon Jul 22 15:26:45.358 2013 (scI0I0.2394 : GMS._EO.WRK.00229 : NOTICE) EO [GMS] unblocked and exiting
Mon Jul 22 15:26:45.358 2013 (scI0I0.2394 : GMS.GEN.---.00230 : CRITIC) GMS server exiting
Mon Jul 22 15:26:45.359 2013 (scI0I0.2343 : AMF.---.---.01249 : ALERT) Terminating CompName [logServer_scI0I0] via eoPort [0x4]
Mon Jul 22 15:26:45.362 2013 (scI0I0.2343 : AMF.HDL.---.01261 : CRITIC) This DB handle [0xea1668] is corrupt, MD check failed
Mon Jul 22 15:26:45.364 2013 (scI0I0.2380 : LOG.---.---.01174 : ERROR) clLogSvrShmAndFlusherClose(): rc[0x b0004]
Mon Jul 22 15:26:45.855 2013 (scI0I0.2343 : AMF.HDL.---.01266 : CRITIC) This DB handle [0xea1668] is corrupt, MD check failed
Mon Jul 22 15:26:45.855 2013 (scI0I0.2486 : step2_log_EO.XPORT.FIN.00274 : NOTICE) Inside fake transport finalize
Mon Jul 22 15:26:46.368 2013 (scI0I0.2380 : LOG._EO.RCV.01181 : ERROR) Retry RMD function [0.11] lookup due to error 4.
Mon Jul 22 15:26:46.842 2013 (scI0I0.2423 : CKP.HDL.HBD.00837 : WARN) Handle [0x1dfc688:0X2] has not been cleaned, destroying...
Mon Jul 22 15:26:46.939 2013 (scI0I0.2422 : NAM.XPORT.FIN.00163 : NOTICE) Inside fake transport finalize
Mon Jul 22 15:26:46.939 2013 (scI0I0.2343 : AMF.HDL.---.01271 : CRITIC) This DB handle [0xea1668] is corrupt, MD check failed
Mon Jul 22 15:26:47.369 2013 (scI0I0.2380 : LOG._EO.RCV.01183 : ERROR) Retry RMD function [0.11] lookup due to error 4.
Mon Jul 22 15:26:47.415 2013 (scI0I0.2456 : MSG.XPORT.FIN.00150 : NOTICE) Inside fake transport finalize
Mon Jul 22 15:26:47.415 2013 (scI0I0.2343 : AMF.HDL.---.01276 : CRITIC) This DB handle [0xea1668] is corrupt, MD check failed
Mon Jul 22 15:26:47.445 2013 (scI0I0.2423 : CKP.XPORT.FIN.00845 : NOTICE) Inside fake transport finalize
Mon Jul 22 15:26:47.446 2013 (scI0I0.2343 : AMF.HDL.---.01281 : CRITIC) This DB handle [0xea1668] is corrupt, MD check failed
Mon Jul 22 15:26:47.459 2013 (scI0I0.2414 : EVT.XPORT.FIN.01459 : NOTICE) Inside fake transport finalize
Mon Jul 22 15:26:47.460 2013 (scI0I0.2343 : AMF.HDL.---.01286 : CRITIC) This DB handle [0xea1668] is corrupt, MD check failed
Mon Jul 22 15:26:48.369 2013 (scI0I0.2380 : LOG._EO.RCV.01187 : ERROR) Retry RMD function [0.11] lookup due to error 4.
Mon Jul 22 15:26:49.369 2013 (scI0I0.2380 : LOG._EO.RCV.01188 : ERROR) Retry RMD function [0.11] lookup due to error 4.
Mon Jul 22 15:26:50.370 2013 (scI0I0.2380 : LOG._EO.RCV.01189 : ERROR) Retry RMD function [0.11] lookup due to error 4.
Mon Jul 22 15:26:51.374 2013 (scI0I0.2380 : LOG._EO.RCV.01190 : ERROR) Retry RMD function [0.11] lookup due to error 4.
Mon Jul 22 15:26:52.375 2013 (scI0I0.2380 : LOG._EO.RCV.01191 : ERROR) Retry RMD function [0.11] lookup due to error 4.
Mon Jul 22 15:26:53.375 2013 (scI0I0.2380 : LOG._EO.RCV.01192 : ERROR) Retry RMD function [0.11] lookup due to error 4.
Mon Jul 22 15:26:54.375 2013 (scI0I0.2380 : LOG._EO.RCV.01193 : ERROR) Retry RMD function [0.11] lookup due to error 4.
Mon Jul 22 15:26:55.366 2013 (scI0I0.2343 : AMF.---.---.01287 : ERROR) Component [logServer_scI0I0] did not terminated within the specified limit
Mon Jul 22 15:26:55.366 2013 (scI0I0.2343 : AMF.---.---.01288 : ERROR) Component logServer_scI0I0 did not terminate within the specified limit
Mon Jul 22 15:26:55.371 2013 (scI0I0.2343 : AMF.HDL.---.01296 : CRITIC) This DB handle [0xea1668] is corrupt, MD check failed
Mon Jul 22 15:26:56.297 2013 (scI0I0.2343 : AMF._EO.WRK.01299 : NOTICE) EO [AMF] unblocked and exiting

CangTranOC commented 11 years ago

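Root cause: clAmsCCBHandleDBCleanup() is still called when a component goes down after clAmsFinalize() has run. This leads to calling clHandleWalk(gAms.ccbHandleDB, ...) after gAms.ccbHandleDB has been destroyed, which produces the repeated "This DB handle [0xea1668] is corrupt" messages in the log above.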

Solution: in clAmsCCBHandleDBCleanup(), check whether AMF has already been finalized and, if so, return immediately without walking the handle database.
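
A minimal sketch of that guard, for illustration only. It is not the actual SAFplus patch: the types HandleDb and Ams, the finalized flag, and the functions handle_walk(), ams_finalize() and ccb_handle_db_cleanup() are hypothetical stand-ins for clHandleWalk(), clAmsFinalize() and clAmsCCBHandleDBCleanup(); only the name ccbHandleDB is taken from the comment above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for a handle database; not the real SAFplus type. */
typedef struct {
    bool valid;    /* becomes false once the database is destroyed */
    int  handles;  /* number of handles still registered */
} HandleDb;

/* Hypothetical stand-in for the global AMS state (gAms in the issue). */
typedef struct {
    bool     finalized;   /* assumed flag, set when finalize has run */
    HandleDb ccbHandleDB;
} Ams;

Ams gAms = { false, { true, 3 } };
int corrupt_walks = 0;  /* counts walks attempted on a destroyed database */

/* Models clHandleWalk(): walking a destroyed database is the reported failure. */
int handle_walk(HandleDb *db)
{
    assert(db != NULL);
    if (!db->valid) {
        corrupt_walks++;
        fprintf(stderr, "This DB handle is corrupt\n");
        return -1;
    }
    db->handles = 0;  /* pretend every handle was visited and released */
    return 0;
}

/* Models clAmsFinalize(): destroys the database and records that fact. */
void ams_finalize(void)
{
    gAms.ccbHandleDB.valid = false;
    gAms.ccbHandleDB.handles = 0;
    gAms.finalized = true;
}

/* Models clAmsCCBHandleDBCleanup() with the proposed early return. */
int ccb_handle_db_cleanup(void)
{
    if (gAms.finalized) {
        return 0;  /* nothing left to clean; do not touch the destroyed DB */
    }
    return handle_walk(&gAms.ccbHandleDB);
}
```

With this guard, calling ccb_handle_db_cleanup() after ams_finalize() returns without reaching handle_walk(), so the "corrupt handle" path is never taken during shutdown. In a multi-threaded process the flag check and the walk would likely also need to be protected by whatever lock guards the AMS state; the sketch ignores that.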