WIP: etcdserver: separate "raft log compact" from snapshot

etcd-io / etcd

Distributed reliable key-value store for the most critical data of a distributed system

https://etcd.io

Apache License 2.0

46.77k stars 9.64k forks source link

WIP: etcdserver: separate "raft log compact" from snapshot #18235

Open clement2026 opened 3 days ago

clement2026 commented 3 days ago

I'm looking into https://github.com/etcd-io/etcd/issues/17098 @serathius and made changes to separate the compact logic from the snapshot logic.

I ran rw-benchmark.sh several times to observe performance changes. I expected this patch to slow down etcd, but it seems to have improved rw-benchmark.sh results. I’m not sure how. Similar results were observed on both cloud VMs (8vCPU, 16GB Memory, NVMe) and a bare-metal machine (8CPU, 32GB Memory, SSD).

Since the rw-benchmark.sh script was taking forever to finish, I adjusted the parameters to complete the benchmark within a few hours.

Below are the script and the results from running it on a Vultr bare-metal machine (8 CPUs, 32GB Memory, SSD, Ubuntu 24.04 LTS x64).

export RATIO_LIST="4/1"
export REPEAT_COUNT=3
export RUN_COUNT=50000
./rw-benchmark.sh

compare main.csv issue-17098-patch.csv

k8s-ci-robot commented 3 days ago

Hi @clement2026. Thanks for your PR.

I'm waiting for a etcd-io member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available [here](https://git.k8s.io/community/contributors/guide/pull-requests.md). If you have questions or suggestions related to my behavior, please file an issue against the [kubernetes-sigs/prow](https://github.com/kubernetes-sigs/prow/issues/new?title=Prow%20issue:) repository.

serathius commented 2 days ago

/ok-to-test

serathius commented 2 days ago

/retest

serathius commented 2 days ago

Great work, the change look good, but as any change in the etcd apply loop we need to be careful here.

I expected this patch to slow down etcd, but it seems to have improved rw-benchmark.sh results. I’m not sure how.

Would be good to investigate, would you be able to collect CPU&Memory profiles?

serathius commented 2 days ago

Found by robustness tests:

panic: need non-empty snapshot

goroutine 210 [running]:
go.etcd.io/raft/v3.(*raft).maybeSendAppend(0xc0002be180, 0xeabdbb777cf498cb, 0x1)
    go.etcd.io/raft/v3@v3.6.0-alpha.0/raft.go:640 +0x838
go.etcd.io/raft/v3.(*raft).sendAppend(...)
    go.etcd.io/raft/v3@v3.6.0-alpha.0/raft.go:592
go.etcd.io/raft/v3.stepLeader(0xc0002be180, {0x9, 0xbf19ae4419db00dc, 0xeabdbb777cf498cb, 0x3, 0x0, 0x0, {0x0, 0x0, 0x0}, ...})
    go.etcd.io/raft/v3@v3.6.0-alpha.0/raft.go:1545 +0x99d
go.etcd.io/raft/v3.(*raft).Step(0xc0002be180, {0x9, 0xbf19ae4419db00dc, 0xeabdbb777cf498cb, 0x3, 0x0, 0x0, {0x0, 0x0, 0x0}, ...})
    go.etcd.io/raft/v3@v3.6.0-alpha.0/raft.go:1215 +0xe3c
go.etcd.io/raft/v3.(*node).run(0xc000018240)
    go.etcd.io/raft/v3@v3.6.0-alpha.0/node.go:399 +0x825
created by go.etcd.io/raft/v3.StartNode in goroutine 1
    go.etcd.io/raft/v3@v3.6.0-alpha.0/node.go:273 +0x59

clement2026 commented 2 days ago

Great work, the change look good, but as any change in the etcd apply loop we need to be careful here.

I expected this patch to slow down etcd, but it seems to have improved rw-benchmark.sh results. I’m not sure how.

Would be good to investigate, would you be able to collect CPU&Memory profiles?

Thanks! I understand the importance of the apply loop. I'll investigate it and provide CPU & Memory profiles after resolving the bug and test failures.

clement2026 commented 2 days ago

/retest

serathius commented 1 day ago

Thanks for fixing test, please note this PR is small, but might have big impact on etcd behavior, it's review might take a while. Would be great to repeat the tests and see the profile difference to understand performance difference.

cc @ahrtr

clement2026 commented 1 day ago

Thanks for fixing test, please note this PR is small, but might have big impact on etcd behavior, it's review might take a while. Would be great to repeat the tests and see the profile difference to understand performance difference.

cc @ahrtr

Thanks for checking in. I temporarily fixed the test to test my ideas, but it might cause a serious memory leak. Currently, I'm developing an optimized version to properly fixed the test. I’ll update you once it's done and provide the CPU and memory profiles.

clement2026 commented 15 hours ago

/retest

clement2026 commented 15 hours ago

/retest

clement2026 commented 14 hours ago

/retest

clement2026 commented 14 hours ago

/retest

clement2026 commented 14 hours ago

/retest