Open polaris-alioth opened 6 months ago
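The issue appears to have been triggered by Sentinel entering TILT mode (sentinel.tilt), which in turn prevented the failover from being carried out. Sentinel enters TILT mode when the interval between two consecutive runs of its timer is negative or longer than SENTINEL_TILT_TRIGGER (2 seconds), which typically means the system clock changed or the Sentinel process was blocked, for example by heavy load. While in TILT mode Sentinel keeps monitoring the other Sentinel nodes, the primary, and the replicas, but it stops acting on what it sees, so no failover is started until SENTINEL_TILT_PERIOD (30 seconds) has passed without a new trigger. From the logs, it seems that the node was oscillating between subjectively down (SDOWN) and objectively down (ODOWN).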
Code (simplified from sentinel.c):

    /* Check if we need to enter the TILT mode. */
    void sentinelCheckTiltCondition(void) {
        mstime_t now = mstime();
        mstime_t delta = now - sentinel.previous_time;

        /* The timer ran late or the clock jumped backwards. */
        if (delta < 0 || delta > SENTINEL_TILT_TRIGGER) {
            sentinel.tilt = 1;
            sentinel.tilt_start_time = mstime();
            sentinelEvent(LL_WARNING,"+tilt",NULL,"#tilt mode entered");
        }
        sentinel.previous_time = mstime();
    }

    /* Check if we need to exit the TILT mode (done in sentinelHandleRedisInstance()). */
    if (sentinel.tilt) {
        if (mstime() - sentinel.tilt_start_time < SENTINEL_TILT_PERIOD) return;
        sentinel.tilt = 0;
        sentinelEvent(LL_WARNING,"-tilt",NULL,"#tilt mode exited");
    }
Version: 6.2.7
Deployment: 1 master, 2 replicas, 3 Sentinels
Question: The master node's memory and CPU are exhausted and clients cannot connect, but no replica is promoted to master.
sentinel 2 log
sentinel 3 log