cockroachdb / cockroach

CockroachDB — the cloud native, distributed SQL database designed for high availability, effortless scale, and control over data placement.
https://www.cockroachlabs.com

roachtest: perturbation test for intents #135969

Open andrewbaptist opened 3 days ago

andrewbaptist commented 3 days ago

Adds the test perturbation/*/intents which stresses adding empty rows with intents to a cluster.

Fixes: https://github.com/cockroachdb/cockroach/issues/135937

Release note: None

cockroach-teamcity commented 3 days ago

This change is Reviewable

andrewbaptist commented 3 days ago

@aadityasondhi This isn't ready yet, but something close to this will likely work. I'm not exactly sure of the impact of this, so I will have to test a bit first.

andrewbaptist commented 2 days ago

From a run https://go.crdb.dev/roachtest-grafana/baptist-cockroachlabs-com-1732315103/perturbation-full-intents/1732315223012/1732317714477

2024/11/22 23:12:51 framework.go:642: validating stats during the perturbation
2024/11/22 23:12:51 framework.go:717: PASSED : follower-read  : Increase 1.2700 <= 20.0000 BASE: 5.907973ms SCORE: 7.503314ms
2024/11/22 23:12:51 framework.go:717: PASSED : read           : Increase 1.2875 <= 20.0000 BASE: 5.712685ms SCORE: 7.355313ms
2024/11/22 23:12:51 framework.go:717: PASSED : write          : Increase 1.1416 <= 20.0000 BASE: 6.116694ms SCORE: 6.983089ms
2024/11/22 23:12:51 framework.go:644: validating stats after the perturbation
2024/11/22 23:12:51 framework.go:717: PASSED : follower-read  : Increase 9.1479 <= 20.0000 BASE: 5.907973ms SCORE: 54.045436ms
2024/11/22 23:12:51 framework.go:717: PASSED : read           : Increase 11.4126 <= 20.0000 BASE: 5.712685ms SCORE: 65.196569ms
2024/11/22 23:12:51 framework.go:717: PASSED : write          : Increase 7.2754 <= 20.0000 BASE: 6.116694ms SCORE: 44.501648ms

I was not able to reproduce IO overload with this test. It generates about 72 GiB of intents, but they are all cleaned up in about 20 seconds with minimal impact on storage. I'm wondering if there is a better way to generate more IO overload.

Regardless, I think this is a good test, and we could expand it over time.