Closed anighanta closed 3 years ago
Summary of the defect The reconnecting nodes won't go behind after joining back.
How to reproduce (if possible) Run test MixedOps-Global-ND-3NReconnect-6k-30m
MixedOps-Global-ND-3NReconnect-6k-30m
Service logs (if applicable)
2021-06-30 08:23:15.665 44 INFO STATE_TO_DISK <<platform-core: objectFileManager 12 #0>> SignedStateFileManager: 'Signed state for round 3657' waiting for snapshotTask 2021-06-30 08:23:15.666 45 INFO STATE_TO_DISK <<platform-core: objectFileManager 12 #0>> SignedStateFileManager: Finished writing 'Signed state for round 3657' to disk 2021-06-30 08:40:51.210 1 DEBUG STARTUP <main> Browser: main() started {} [com.swirlds.logging.payloads.NodeStartPayload] 2021-06-30 08:40:51.269 2 DEBUG STARTUP <main> Browser: About to run startPlatforms() 2021-06-30 08:40:51.281 3 DEBUG STARTUP <main> Browser: About do crypto instantiation 2021-06-30 08:40:51.289 4 DEBUG STARTUP <main> Browser: About start loading keys 2021-06-30 08:40:51.597 5 DEBUG STARTUP <main> Browser: Done loading keys 2021-06-30 08:40:51.598 6 DEBUG STARTUP <main> Browser: About to start creating Crypto objects 2021-06-30 08:40:53.867 7 DEBUG STARTUP <main> Browser: Done creating Crypto objects 2021-06-30 08:40:53.867 8 DEBUG STARTUP <main> Browser: Done with crypto instantiation 2021-06-30 08:40:54.042 9 DEBUG STARTUP <main> Browser: Starting platforms 2021-06-30 08:40:54.049 10 DEBUG STARTUP <main> Browser: Scanning the classpath for RuntimeConstructable classes 2021-06-30 08:40:55.276 11 DEBUG STARTUP <main> Browser: Done with registerConstructables, time taken 1225ms 2021-06-30 08:41:05.586 12 DEBUG RECONNECT <main> FCMap: FCMap Initialized [ internalMapSize = 610734, treeSize = 610734 ] 2021-06-30 08:41:05.602 13 DEBUG RECONNECT <main> FCMap: FCMap Initialized [ internalMapSize = 1715, treeSize = 1715 ] 2021-06-30 08:41:07.651 14 DEBUG RECONNECT <main> FCMap: FCMap Initialized [ internalMapSize = 614703, treeSize = 614703 ] 2021-06-30 08:41:07.664 15 DEBUG RECONNECT <main> FCMap: FCMap Initialized [ internalMapSize = 2330, treeSize = 2330 ] 2021-06-30 08:41:10.171 16 DEBUG RECONNECT <main> FCMap: FCMap Initialized [ internalMapSize = 955557, treeSize = 955557 ] 2021-06-30 08:41:10.174 17 DEBUG RECONNECT <main> FCMap: FCMap Initialized [ internalMapSize = 0, treeSize = 0 ] 2021-06-30 08:41:18.174 18 DEBUG SNAPSHOT_MANAGER <main> SnapshotManager: SnapshotManager: Successfully queued snapshot request [taskType='RESTORE', applicationName='com.hedera.services.ServicesMain', worldId='123', nodeId=12, roundNumber=3657, snapshotId=restore, timeStarted=2021-06-30T08:41:18.164680Z, timeCompleted=null, complete=false, error=false ] 2021-06-30 08:41:18.356 19 DEBUG SNAPSHOT_MANAGER << SnapshotManager >> SnapshotManager: SnapshotManager: Process successfully executed for snapshot task [taskType='RESTORE', applicationName='com.hedera.services.ServicesMain', worldId='123', nodeId=12, roundNumber=3657, snapshotId=restore, timeStarted=2021-06-30T08:41:18.164680Z, timeCompleted=null, complete=false, error=false ] 2021-06-30 08:41:18.357 20 INFO SNAPSHOT_MANAGER << SnapshotManager >> SnapshotManager: SnapshotManager: Completed task [taskType='RESTORE', applicationName='com.hedera.services.ServicesMain', worldId='123', nodeId=12, roundNumber=3657, snapshotId=restore, timeStarted=2021-06-30T08:41:18.164680Z, timeCompleted=2021-06-30T08:41:18.357292Z, complete=true, error=false ] 2021-06-30 08:41:18.357 21 INFO STATE_TO_DISK <main> SignedStateFileManager: Deleted: '/home/services_nightly_regression/remoteExperiment/data/saved/com.hedera.services.ServicesMain/12/123/3656/settingsUsed.txt' 2021-06-30 08:41:18.411 22 INFO STATE_TO_DISK <main> SignedStateFileManager: Deleted: '/home/services_nightly_regression/remoteExperiment/data/saved/com.hedera.services.ServicesMain/12/123/3656/SignedState.swh' 2021-06-30 08:41:18.412 23 INFO STATE_TO_DISK <main> SignedStateFileManager: Deleted: '/home/services_nightly_regression/remoteExperiment/data/saved/com.hedera.services.ServicesMain/12/123/3656/PostgresBackup.tar.gz' 2021-06-30 08:41:18.412 24 INFO STATE_TO_DISK <main> SignedStateFileManager: Deleted: '/home/services_nightly_regression/remoteExperiment/data/saved/com.hedera.services.ServicesMain/12/123/3656' 2021-06-30 08:41:18.414 25 DEBUG STARTUP <main> BinaryObjectStore: Recovery Starting 2021-06-30 08:41:18.414 26 DEBUG STARTUP <main> BinaryObjectStore: Recovering Objects [ objectCount = 1715, uniqueObjectCount = 25 ] 2021-06-30 08:41:18.442 27 DEBUG STARTUP <main> BinaryObjectStore: Recovery Finished 2021-06-30 08:41:18.443 28 INFO STARTUP <main> SwirldsPlatform: startUpEventFrozenEndTime: 2021-06-30T08:41:28.443160Z 2021-06-30 08:41:18.444 29 INFO STARTUP <main> SwirldsPlatform: initialize eventStreamManager 2021-06-30 08:41:18.454 30 DEBUG STARTUP <main> SwirldsPlatform: Platform has loaded a saved state {"round":3657,"consensusTimestamp":{"nano":487192000,"epochSecond":1625041382},"willFreezeUntil":{"nano":443160000,"epochSecond":1625042488}} [com.swirlds.logging.payloads.SavedStateLoadedPayload] 2021-06-30 08:41:18.459 31 INFO STARTUP <main> EventFlow: doConsMinGenFamous after startup [(3646,55658), (3647,55671), (3648,55685), (3649,55699), (3650,55715), (3651,55730), (3652,55747), (3653,55762), (3654,55778), (3655,55792), (3656,55808), (3657,55845)] 2021-06-30 08:41:18.460 32 INFO STARTUP <main> EventFlow: initialHash after startup ecaeff83bbcf77218f1dfbcd01dc825baadc08f17e7661196f34037f95631f32381558c899221005a5343cd8daca6e7c 2021-06-30 08:41:18.484 33 DEBUG STARTUP <main> ConsensusImpl: ConsensusImpl is initialized from signed state. minRound: 3646(min gen = 55658), maxRound: 3657(max gen = 55845) 2021-06-30 08:41:18.487 34 INFO STARTUP <main> SwirldsPlatform: Last known sequence numbers after restart are [40362, 39739, 40091, 40604, 40531, 38722, 39942, 31840, 40523, 40563, 40288, 40300, 38717, 40447, 40290] 2021-06-30 08:41:18.494 35 INFO STARTUP <main> EventFlow: EventFlow.startAll(), forCons.size: 0, forCurr.size: 0, forNext.size: null, forSigs.size: 1664, forWork.size: null 2021-06-30 08:41:20.513 36 DEBUG STARTUP <main> Browser: jvmPauseDetectorThread started 2021-06-30 08:41:20.513 37 DEBUG STARTUP <main> Browser: Done with starting platforms 2021-06-30 08:41:20.531 38 DEBUG STARTUP <main> Browser: main() finished 2021-06-30 08:41:20.543 39 INFO STARTUP <<platform-core: CsvWriter 12 #0>> CsvWriter: CsvWriter: Initializing statistics output in CSV format [ writePeriod = '3000', csvOutputFolder = '/home/services_nightly_regression/remoteExperiment', csvFileName = 'ServicesStats12.csv' ] 2021-06-30 08:41:20.544 40 DEBUG STARTUP <<platform-core: CsvWriter 12 #0>> CsvWriter: CsvWriter: Using the existing stats folder [ folder = '/home/services_nightly_regression/remoteExperiment' ] 2021-06-30 08:41:20.630 41 INFO PLATFORM_STATUS <sync_server_2> SwirldsPlatform: Platform status changed. {"oldStatus":"","newStatus":"ACTIVE"} [com.swirlds.logging.payloads.PlatformStatusPayload] 2021-06-30 08:41:20.630 42 INFO PLATFORM_STATUS <sync_server_2> SwirldsPlatform: Platform status changed to: ACTIVE 2021-06-30 08:41:28.067 43 INFO STATE_TO_DISK <<platform-core: objectFileManager 12 #0>> SignedStateFileManager: Started writing 'Signed state for round 3728' to disk 2021-06-30 08:41:28.069 44 DEBUG SNAPSHOT_MANAGER <<platform-core: objectFileManager 12 #0>> SnapshotManager: SnapshotManager: Successfully queued snapshot request [taskType='BACKUP', applicationName='com.hedera.services.ServicesMain', worldId='123', nodeId=12, roundNumber=3728, snapshotId=00000003-000001FE-1, timeStarted=2021-06-30T08:41:28.068281Z, timeCompleted=null, complete=false, error=false ] 2021-06-30 08:41:28.259 45 DEBUG SNAPSHOT_MANAGER << SnapshotManager >> SnapshotManager: SnapshotManager: Process successfully executed for snapshot task [taskType='BACKUP', applicationName='com.hedera.services.ServicesMain', worldId='123', nodeId=12, roundNumber=3728, snapshotId=00000003-000001FE-1, timeStarted=2021-06-30T08:41:28.068281Z, timeCompleted=null, complete=false, error=false ] 2021-06-30 08:41:28.260 46 INFO SNAPSHOT_MANAGER << SnapshotManager >> SnapshotManager: SnapshotManager: Completed task [taskType='BACKUP', applicationName='com.hedera.services.ServicesMain', worldId='123', nodeId=12, roundNumber=3728, snapshotId=00000003-000001FE-1, timeStarted=2021-06-30T08:41:28.068281Z, timeCompleted=2021-06-30T08:41:28.259741Z, complete=true, error=false ] 2021-06-30 08:41:43.165 47 INFO STATE_TO_DISK <<platform-core: objectFileManager 12 #0>> SignedStateFileManager: Done writing saved state with HashEventsCons 7b1b3313e6561469d2d2270543a29ab750b23131475728cc78d7c2a8754d28aab965ef42dcfb5f88c7e36a4773eb1278, starting with local events 2021-06-30 08:41:43.167 48 INFO STATE_TO_DISK <<platform-core: objectFileManager 12 #0>> SignedStateFileManager: 'Signed state for round 3728' waiting for snapshotTask 2021-06-30 08:41:43.168 49 INFO STATE_TO_DISK <<platform-core: objectFileManager 12 #0>> SignedStateFileManager: Finished writing 'Signed state for round 3728' to disk 2021-06-30 08:41:58.380 50 INFO STATE_TO_DISK <<platform-core: objectFileManager 12 #0>> SignedStateFileManager: Started writing 'Signed state for round 3823' to disk
Possible Resolution decreasing "state.roundsExpired" from 500 might help
500
This issue is fixed and the passing result https://hedera-hashgraph.slack.com/archives/CKWHL8R9A/p1625475393462700
Summary of the defect The reconnecting nodes won't go behind after joining back.
How to reproduce (if possible) Run test
MixedOps-Global-ND-3NReconnect-6k-30m
Service logs (if applicable)
Possible Resolution decreasing "state.roundsExpired" from
500
might help