Open lmatz opened 1 year ago
If enough people agree with this approach, I will investigate what syntax we can learn from chaos mesh and draft a specification.
Not necessarily yaml, though.
LGTM
LGTM. I'm trying to merge the recovery test and scale test into one crate and then provide a unified configuration.
Related: #6485
Hey, any updates?
No update yet 🥵
We have two deterministic simulation tests, i.e. recovery tests and scale tests, where we inject special behaviors:
node-killing
reschedule
Currently,
We can mimic the way chaos mesh specifies all kinds of faults (OS level) to specify these special behaviors (from RisingWave), e.g. via a yaml file, such as https://chaos-mesh.org/docs/simulate-pod-chaos-on-kubernetes/#pod-failure-example.
I imagine we can still specify particular sequences of certain behaviors, or have a pre-determined chaos generator to randomly generate instructions.
The benefits:
Although we can probably mimic node-killing in chaos mesh by process crash (but not reschedule, because this requires instructions from RW), chaos mesh runs things in the real world while simulation tests run in the simulated world.
So chaos mesh alone is not enough.
Occurred to me when thinking about #6369.
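To make the idea of "particular sequences" and a seeded chaos generator more concrete, here is a rough sketch of what the in-memory form of such a specification might look like. Everything in it is an assumption for illustration only: the `Fault`, `Step`, and `generate` names, the millisecond timestamps, and the tiny xorshift PRNG are hypothetical and are not taken from RisingWave, madsim, or chaos mesh. A yaml (or other) file would simply deserialize into something like `Vec<Step>`.

```rust
// Hypothetical sketch: none of these names exist in RisingWave or chaos mesh.

/// The two special behaviors named in this issue.
#[derive(Debug, Clone, PartialEq, Eq)]
enum Fault {
    /// Kill the simulated node with the given (made-up) name.
    KillNode { node: String },
    /// Trigger a reschedule; `plan` is an opaque placeholder string.
    Reschedule { plan: String },
}

/// One entry of a fault schedule: apply `fault` at `at_ms` of simulated time.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Step {
    at_ms: u64,
    fault: Fault,
}

/// Minimal xorshift64 PRNG, so the same seed always gives the same schedule.
struct XorShift64(u64);

impl XorShift64 {
    fn new(seed: u64) -> Self {
        // xorshift gets stuck at zero, so replace a zero seed with a constant.
        XorShift64(if seed == 0 { 0x9E37_79B9_7F4A_7C15 } else { seed })
    }

    fn next(&mut self) -> u64 {
        let mut x = self.0;
        x ^= x << 13;
        x ^= x >> 7;
        x ^= x << 17;
        self.0 = x;
        x
    }
}

/// Randomly generate `count` steps, each 1..=max_gap_ms after the previous one.
fn generate(seed: u64, nodes: &[&str], count: usize, max_gap_ms: u64) -> Vec<Step> {
    assert!(!nodes.is_empty() && max_gap_ms > 0);
    let mut rng = XorShift64::new(seed);
    let mut now = 0u64;
    let mut steps = Vec::with_capacity(count);
    for _ in 0..count {
        now += 1 + rng.next() % max_gap_ms;
        let fault = if rng.next() % 2 == 0 {
            let node = nodes[(rng.next() % nodes.len() as u64) as usize];
            Fault::KillNode { node: node.to_string() }
        } else {
            Fault::Reschedule { plan: "placeholder".to_string() }
        };
        steps.push(Step { at_ms: now, fault });
    }
    steps
}
```

A hand-written sequence is then just a literal `Vec<Step>`, while `generate(42, &["compute-1", "compute-2"], 5, 1000)` would produce a random one that is reproducible from the seed, which is what a deterministic simulation needs.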