3box / keramik

A k8s operator for simulating Ceramic networks

chore: Less flaky results from scenario validation #166

Closed dav1do closed 4 months ago

dav1do commented 4 months ago

We keep getting failures during longer simulations because the manager and workers disconnect at some point and goose reports no metrics, even though the Prometheus metrics show that the run would still have passed. We still rely on goose for the run time, but now keep a backup duration as well (it is slightly longer, so it won't bias us toward a better result), and we use the Prometheus metrics exclusively for validation.
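As a rough illustration of the idea (not the code in this PR), the validation can be sketched as below. The function names, the counter-delta approach, and the threshold parameter are all hypothetical; the sketch only assumes that a Prometheus counter is sampled before and after the run and that goose may or may not report a duration.

```rust
use std::time::Duration;

/// Pick the run duration: goose's own value when it reported one,
/// otherwise the backup measured around the whole run.
/// (Hypothetical helper, not part of keramik or goose.)
fn run_duration(goose_reported: Option<Duration>, backup: Duration) -> Duration {
    goose_reported.unwrap_or(backup)
}

/// Requests per second from a Prometheus counter sampled before and after the run.
/// Returns None if the counter went backwards (e.g. a reset) or the duration is zero.
fn requests_per_second(before: u64, after: u64, run_time: Duration) -> Option<f64> {
    let delta = after.checked_sub(before)?;
    let secs = run_time.as_secs_f64();
    if secs <= 0.0 {
        return None;
    }
    Some(delta as f64 / secs)
}

/// Validate a scenario using only the Prometheus counter and the chosen duration.
/// `min_rps` is an assumed pass threshold for illustration.
fn scenario_passed(
    before: u64,
    after: u64,
    goose_reported: Option<Duration>,
    backup: Duration,
    min_rps: f64,
) -> bool {
    let run_time = run_duration(goose_reported, backup);
    match requests_per_second(before, after, run_time) {
        Some(rps) => rps >= min_rps,
        None => false,
    }
}
```

Because the backup duration covers slightly more than the actual load window, dividing the same counter delta by it gives a lower rate, so falling back to it can only make the check stricter.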

I also deleted the old recon keys-only scenario, as it no longer applies and is just noise.

Results from a 3-minute local test:

image