cncf / tag-app-delivery

Propose chaos-mesh as a CNCF sandbox project #23

dcalvin closed 4 years ago

dcalvin commented 4 years ago

Hi, we would like to propose Chaos Mesh as a CNCF sandbox project. A few minor details still need to be filled in or updated, but the proposal is otherwise complete.

I also created the PR for the proposal in the TOC repo:

https://github.com/cncf/toc/pull/367

Name of Project: Chaos Mesh

Description:

Chaos Mesh is a versatile Chaos Engineering platform that orchestrates chaos experiments in Kubernetes environments. It offers a broad set of fault injection methods for complex systems on Kubernetes, covering faults in Pods, the network, the file system, and even the kernel.

Chaos Mesh originated from the internal chaos engineering platform at PingCAP, where it was built to ensure the resilience of TiDB, PingCAP's distributed database system. As the system evolved and testing requirements multiplied, the team realized that they needed an easy-to-use, scalable, and universal chaos testing platform. The combination of chaos and Kubernetes became the natural choice. Chaos Mesh uses CRDs (CustomResourceDefinitions) to define chaos objects, so it integrates naturally with the Kubernetes ecosystem.
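
To make the CRD point concrete, here is a minimal sketch of what a chaos object might look like, built as a Python dict and printed as JSON. The `apiVersion`, `kind`, and every `spec` field name and value below are assumptions for illustration and are not taken from this proposal; the project's documentation is the authority on the actual schema. The helper `build_pod_chaos` is hypothetical as well.

```python
import json


def build_pod_chaos(name, namespace, target_labels):
    """Return a manifest for a hypothetical pod-kill chaos object.

    ASSUMPTION: the apiVersion, kind, and spec field names below are
    illustrative only; consult the Chaos Mesh docs for the real schema.
    """
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",  # assumed API group/version
        "kind": "PodChaos",                       # assumed kind name
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "action": "pod-kill",  # which fault to inject (assumed value)
            "mode": "one",         # target one matching Pod (assumed value)
            "selector": {"labelSelectors": dict(target_labels)},
        },
    }


manifest = build_pod_chaos("pod-kill-example", "chaos-testing", {"app": "tikv"})
print(json.dumps(manifest, indent=2))
```

Because such an experiment would be an ordinary Kubernetes object, it could be created, listed, and deleted with standard tooling such as `kubectl`, which is what integrating with the ecosystem amounts to in practice.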

At the current stage, it supports the following fault injection types:

Besides the versatile chaos types, it also offers Chaos Dashboard (under development), a visualization panel that shows the impact of chaos experiments on the system's online services, making chaos tests easy to observe and manage.

Sponsor from TOC: TBD

Unique Identifier: chaos-mesh

Preferred Maturity Level: Sandbox

License: Apache License, Version 2.0

Source control repositories: https://github.com/pingcap/chaos-mesh (to be moved to https://github.com/chaos-mesh)

Issue tracker: GitHub

Infrastructure Required:

Chaos Mesh uses an in-house Jenkins CI cluster for integration tests. We plan to use the CNCF test cluster to run stability and performance tests automatically in the future.

Website: https://github.com/pingcap/chaos-mesh

Documentation: https://github.com/pingcap/chaos-mesh/wiki

Release methodology and mechanics:

This is currently being defined. We plan to release every few months, with a release candidate (RC) process.

External dependencies (including licenses):

BSD:

Initial committers:

| Name | Email | Focus |
| --- | --- | --- |
| Siddon Tang | tl@pingcap.com | Project Lead |
| Qiang Zhou | zhouqiang@pingcap.com | Project Lead |
| CWen | cwen@pingcap.com | Operator, Dashboard |
| YangKeao | yangkeao@pingcap.com | Operator, Dashboard |

Community Stats:

Although it was only open-sourced on Dec 31, 2019, Chaos Mesh has quickly gained recognition in the community, receiving 1400+ stars on GitHub within a month. It is also listed in the CNCF Cloud Native Interactive Landscape.

Comparison

This comparison is intended simply to compare fault injection features supported by Chaos Mesh with other well-known chaos engineering platforms. It is not intended to favor or position one project over another. Any corrections are welcome.

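| Feature | chaos-mesh | chaosmonkey | chaosblade | chaoskube | Litmus |
| --- | --- | --- | --- | --- | --- |
| Platform supported | K8s | VMs/Container | JVM/Container/K8s | K8s | K8s |
| CPU burn | N | N | Y | N | Y |
| Mem burn | N | N | Y | N | Y |
| container kill | Y | Y | Y | N | Y |
| pod failure | Y | N | N | N | Y |
| pod kill | Y | N | Y | Y | Y |
| network partition | Y | N | N | N | N |
| network duplication | Y | N | N | N | N |
| network corrupt | Y | N | Y | N | Y |
| network loss | Y | N | Y | N | Y |
| network delay | Y | N | Y | N | Y |
| I/O delay | Y | N | N | N | N |
| I/O errno | Y | N | N | N | N |
| Time skew | Y | N | N | N | N |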
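
For readers who want to query the comparison rather than scan it, the following sketch transcribes the Y/N values above into a small Python structure. The helper names `supporters` and `unique_to` are hypothetical, introduced only for this illustration; the data is exactly what the table states and carries the same caveat that corrections are welcome.

```python
# Column order matches the comparison table above.
PLATFORMS = ["chaos-mesh", "chaosmonkey", "chaosblade", "chaoskube", "Litmus"]

# One Y/N flag per platform, transcribed row by row from the table.
FEATURES = {
    "CPU burn": "NNYNY",
    "Mem burn": "NNYNY",
    "container kill": "YYYNY",
    "pod failure": "YNNNY",
    "pod kill": "YNYYY",
    "network partition": "YNNNN",
    "network duplication": "YNNNN",
    "network corrupt": "YNYNY",
    "network loss": "YNYNY",
    "network delay": "YNYNY",
    "I/O delay": "YNNNN",
    "I/O errno": "YNNNN",
    "Time skew": "YNNNN",
}


def supporters(feature):
    """Return the platforms marked Y for the given feature."""
    return [p for p, flag in zip(PLATFORMS, FEATURES[feature]) if flag == "Y"]


def unique_to(platform):
    """Return the features that only the given platform is marked Y for."""
    return [f for f in FEATURES if supporters(f) == [platform]]


print(unique_to("chaos-mesh"))
print([f for f in FEATURES if "chaos-mesh" not in supporters(f)])
```

The first call lists the rows where only Chaos Mesh has a Y (network partition, network duplication, I/O delay, I/O errno, and time skew); the second lists the rows it does not cover (CPU burn and memory burn).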

Roadmap:

See https://github.com/pingcap/chaos-mesh/blob/master/ROADMAP.md

Social media accounts

Twitter: @chaos_mesh

Communication channels

GitHub

Slack channel: #sig-chaos-mesh

Community meetings (Planning)

Adopters or potential users?

TiKV/TiDB projects
Netease (testing)

Existing sponsorship

PingCAP

Statement on alignment with CNCF charter mission

Our team believes Chaos Mesh will be a great fit for CNCF. As stated in the CNCF mission:

“These techniques enable loosely coupled systems that are resilient, manageable, and observable. Combined with robust automation, they allow engineers to make high-impact changes frequently and predictably with minimal toil.”

We believe Chaos Mesh is an essential enabler of this mission and a great addition to the sandbox project scope. By covering comprehensive fault injection methods in Pods, the network, the file system, and even the kernel, Chaos Mesh aims to provide a neutral, universal Chaos Engineering platform that enables cloud-native applications to be as resilient as they should be. Chaos Mesh uses CRDs to define chaos objects, so it integrates naturally with the Kubernetes ecosystem. In addition, it integrates with several other projects in the cloud-native ecosystem, such as Helm, Prometheus, and Grafana.

What are we looking for from CNCF:

amye commented 4 years ago

Closing as this is in SIG-Network's land.