actionjack / so-you-want-to-onboard-a-devops-engineer

Guidance on how to make your environment easier to onboard for Web Ops Engineers, SREs and DevOps Practitioners
Creative Commons Attribution 4.0 International

Culture: Lessons from Failure #172

Open actionjack opened 4 years ago

actionjack commented 4 years ago

Who has ever dropped a production database?

actionjack commented 4 years ago

Who has ever made a change to a production system outside the scheduled release window because they thought it wouldn't be an issue, only for it to affect the app adversely?

actionjack commented 1 month ago

How GitOps Provides Effective Safeguards: A Lesson from the Trenches

In the fast-paced world of software development, mistakes can happen, even to the best of us. I recall an incident from my past where I accidentally dropped a database in production. It was a costly error that led to downtime, stress, and a lot of remediation work. The incident would have been far less likely if we had been using GitOps with the proper safeguards in place.

What is GitOps?

GitOps is a modern approach to continuous delivery and operational management using Git as the single source of truth. By leveraging Git repositories for both application and infrastructure code, GitOps provides a declarative way to manage infrastructure and applications. Changes are made through pull requests, ensuring that every change is tracked, auditable, and easily reversible.
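To make "declarative" concrete, here is a minimal sketch of the comparison a GitOps agent performs: the desired state declared in Git is diffed against what is actually running, and the result is a list of actions that would converge the two. The `Action` type, the `plan` function, and the resource names are hypothetical and exist only for illustration; real tools work on much richer resource models.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    kind: str  # "create", "update", or "delete"
    name: str  # name of the resource the action applies to


def plan(desired: dict[str, dict], actual: dict[str, dict]) -> list[Action]:
    """Return the actions needed to make `actual` match `desired`.

    `desired` stands in for the state declared in Git; `actual` stands in
    for what is currently running. Both are hypothetical name -> spec maps.
    """
    actions = []
    # Anything declared in Git but missing or different must be created/updated.
    for name, spec in sorted(desired.items()):
        if name not in actual:
            actions.append(Action("create", name))
        elif actual[name] != spec:
            actions.append(Action("update", name))
    # Anything running but no longer declared in Git is scheduled for deletion.
    for name in sorted(actual):
        if name not in desired:
            actions.append(Action("delete", name))
    return actions


desired = {"web": {"replicas": 3}, "db": {"version": "15"}}
actual = {
    "web": {"replicas": 2},
    "db": {"version": "15"},
    "old-cron": {"schedule": "@daily"},
}

for action in plan(desired, actual):
    print(action.kind, action.name)
```

Because the plan is derived entirely from what is committed, a destructive action such as a delete only appears if someone removed the resource in a pull request, which is exactly where a reviewer can catch it.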

The Power of GitOps Safeguards

1. Immutable History: With GitOps, every change to the system is version-controlled. As long as the main branch is protected against force-pushes, this history is effectively immutable, so you can always trace who made which change and when. In the event of an issue, you can quickly identify the commit that introduced it and revert to a previous stable state.

2. Pull Request Workflow: Changes in GitOps are made through pull requests, which must be reviewed and approved before being merged. This workflow adds an extra layer of scrutiny, reducing the likelihood of accidental changes making it to production. In my case, dropping the database might have been flagged and prevented by a peer review.

3. Automated Deployments: GitOps integrates with CI/CD pipelines to automate deployments. Once a pull request is merged, automated tests and deployment scripts ensure that changes are applied consistently and correctly across environments. This reduces the risk of manual errors and ensures that safeguards are consistently applied.

4. Environment Parity: GitOps promotes environment parity by using the same codebase and configuration across all environments. This means that what works in staging is far more likely to work in production, although differences in data and load mean it is never guaranteed. If we had been using GitOps, the database changes would have been tested in staging first, reducing the risk of issues in production.

5. Policy Enforcement: GitOps can integrate with policy enforcement tools to ensure that changes comply with organizational policies and best practices. This can include safeguards against dangerous operations like dropping databases in production. Automated checks can enforce these policies before changes are merged.
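As an illustration of the policy enforcement point, the sketch below shows the kind of automated check that could run against a pull request before it is merged. It is deliberately naive: it assumes migrations are plain SQL text, strips comments, and flags a small, hypothetical list of destructive statements. The function names and the rule set are assumptions for this example, not part of any particular policy tool, and a real setup would more likely rely on a dedicated policy engine.

```python
import re

# Hypothetical rule set: statements we never want merged without an explicit override.
DESTRUCTIVE = re.compile(
    r"\b(DROP\s+(DATABASE|SCHEMA|TABLE)|TRUNCATE)\b",
    re.IGNORECASE,
)


def strip_comments(sql: str) -> str:
    """Remove /* block */ and -- line comments so commented-out SQL is ignored."""
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)
    sql = re.sub(r"--[^\n]*", " ", sql)
    return sql


def find_violations(sql: str) -> list[str]:
    """Return the destructive statements found in `sql`, normalised to upper case."""
    cleaned = strip_comments(sql)
    return [
        " ".join(match.group(0).upper().split())
        for match in DESTRUCTIVE.finditer(cleaned)
    ]


# Example migration as it might appear in a pull request (made up for this sketch).
migration = """
-- drop table legacy_users;  (commented out, so it is ignored)
ALTER TABLE users ADD COLUMN last_login timestamp;
DROP DATABASE production;
"""

violations = find_violations(migration)
if violations:
    print("Blocked by policy:", ", ".join(violations))
else:
    print("No destructive statements found")
```

In a CI job, a non-empty result would fail the build so the pull request cannot be merged as-is. A text scan like this can produce false positives (for example, a string literal that contains "drop table"), so it complements peer review rather than replacing it.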

My Experience: A Hard Lesson Learned

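Dropping a database in production was a harsh reminder of the importance of safeguards in our deployment processes. The incident caused significant disruption and underscored the need for better practices. Implementing GitOps could have prevented this by putting the safeguards described above between me and the production database: a reviewed pull request, a run in staging first, and an automated policy check.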

Conclusion

GitOps offers a robust framework for managing infrastructure and applications with greater reliability and security. By leveraging the power of Git as a single source of truth and implementing stringent safeguards, organizations can significantly reduce the risk of costly mistakes.

If you haven’t yet explored GitOps, now is the time. It can transform your deployment processes, enhance collaboration, and provide the peace of mind that comes with knowing your systems are protected by effective safeguards.

Have you had any experiences where GitOps could have saved the day? Share your stories and insights in the comments!

#GitOps #DevOps #ContinuousDelivery #InfrastructureAsCode #SoftwareDevelopment #TechLeadership