What is the importance of having an HA infrastructure?
A High Availability (HA) infrastructure keeps IT services accessible and reliable even when individual components fail, which supports business goals and protects against disruptions.
If we can afford 1 s of downtime per day, what is our availability percentage for a year? Explain.
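1 day = 86,400 seconds, so the availability percentage is 100 * (1 - (1/86400)) ≈ 99.9988%. The yearly figure is the same because downtime and total time both scale by 365: 365 seconds of downtime (about 6 minutes) out of 31,536,000 seconds in a year.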
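The arithmetic above can be checked with a short sketch. The function name `availability_percent` and the 365-day year are assumptions made for illustration only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def availability_percent(downtime_seconds: float, period_seconds: float) -> float:
    """Return the share of the period spent up, as a percentage."""
    return 100.0 * (1.0 - downtime_seconds / period_seconds)


# 1 second of downtime per day.
daily = availability_percent(1, SECONDS_PER_DAY)

# The same budget over an assumed 365-day year: 365 seconds of downtime.
yearly = availability_percent(365, 365 * SECONDS_PER_DAY)

print(f"daily:  {daily:.4f}%")
print(f"yearly: {yearly:.4f}%")
```

Both values come out equal, which shows that a fixed per-day downtime budget gives the same availability percentage over any whole number of days.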
What is failover and how can it help us achieve HA?
Failover is the process of automatically switching to a standby or redundant system, component, or network when the currently active one fails or terminates abnormally. Because the standby takes over without waiting for a manual repair, failover can reduce or eliminate downtime.
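As a rough illustration of the idea, the sketch below tries a primary backend and switches to a standby when the primary raises an error. The names `call_with_failover`, `primary`, and `standby` are hypothetical; real failover usually also involves health checks and state replication, which are left out here.

```python
from typing import Callable, Optional, Sequence


def call_with_failover(backends: Sequence[Callable[[], str]]) -> str:
    """Try each backend in order and return the first successful result."""
    last_error: Optional[Exception] = None
    for backend in backends:
        try:
            return backend()
        except Exception as exc:  # any failure triggers a switch to the next backend
            last_error = exc
    raise RuntimeError("all backends failed") from last_error


def primary() -> str:
    # Hypothetical active system that is currently down.
    raise ConnectionError("primary is down")


def standby() -> str:
    # Hypothetical redundant system that takes over.
    return "served by standby"


result = call_with_failover([primary, standby])
print(result)
```

The caller never sees the primary's failure, which is the property that lets failover hide outages from users.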
What are some methods to ensure HA and DR in our systems?
- Failover Testing
- Load Testing
- Graceful Degradation Testing
- Partial/Full Failover Testing
- Chaos Engineering
Explain chaos engineering and name some tools that help us with implementing it.
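Chaos engineering is the practice of deliberately injecting failures into a system, such as killed processes or added network latency, and observing how it responds. Its primary goal is to identify weaknesses, assess how the system behaves under stress, and ensure that it can withstand unexpected conditions in production. This lets organizations improve their systems proactively, before real-world incidents occur. Tools that help implement it include:
- Pumba
- Chaos Toolkit
- LitmusChaos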
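The sketch below shows the underlying idea in miniature: wrap a dependency so that it fails at a chosen rate, then check that the caller degrades gracefully. It does not use the API of any tool listed above; `inject_faults`, `fetch_price`, and `fetch_price_resilient` are hypothetical names for illustration.

```python
import random
from typing import Callable


def inject_faults(func: Callable[[], float], failure_rate: float,
                  rng: random.Random) -> Callable[[], float]:
    """Wrap func so that it raises ConnectionError with the given probability."""
    def wrapper() -> float:
        if rng.random() < failure_rate:
            raise ConnectionError("injected fault")
        return func()
    return wrapper


def fetch_price() -> float:
    # Hypothetical dependency that is normally healthy.
    return 42.0


def fetch_price_resilient(fetch: Callable[[], float], fallback: float) -> float:
    """Caller under test: fall back to a default value when the dependency fails."""
    try:
        return fetch()
    except ConnectionError:
        return fallback


rng = random.Random(0)  # seeded so the experiment is repeatable
flaky = inject_faults(fetch_price, failure_rate=0.5, rng=rng)
results = [fetch_price_resilient(flaky, fallback=-1.0) for _ in range(100)]
print(f"fallbacks used: {results.count(-1.0)} of {len(results)}")
```

If the caller lacked the `try`/`except`, the experiment would surface that weakness as an unhandled exception, which is the kind of finding chaos experiments aim for.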