chaos-mesh / chaosd

A Chaos Engineering toolkit.
Apache License 2.0

Network attack prevents API response #248

Open senges opened 1 year ago

senges commented 1 year ago

Context:

I want to create a network attack that brings the ens21 interface down for 3 minutes, using chaosd in server mode with the chaos-mesh CRD integration.

Running the equivalent attack directly with the chaosd CLI works:

$ chaosd attack network down -d "ens21" --duration "3m"

Then, here is a simple version of the attack manifest:

kind: PhysicalMachineChaos
apiVersion: chaos-mesh.org/v1alpha1
metadata: {}
spec:
  action: network-down
  duration: 4m
  selector: {}
  network-down:
    device: ens21
    duration: 3m

If I apply this manifest to my k8s cluster, this is what happens:

Failed to apply chaos: : Post "http://10.52.0.205:31767/api/attack/network": context deadline exceeded (Client.Timeout exceeded while awaiting headers)

I believe the cause is in server.go#L160: ExecuteAttack() is called, and has to return, before the handler responds with http.StatusOK. That works for most chaos types, but this one is different because it has its own "duration" and recovers on its own, so the response does not reach the client before its timeout expires.

Moreover, note that if spec.duration > spec.network-down.duration, a retry is sent to the host API during the spare time between the two durations (the 4th minute in the manifest above). This second request does return HTTP 200, and it triggers a second network disruption.