matte21 closed this issue 5 years ago
@matte21 are you asking for an example of fio usage? Here is an incantation I have used in the past. I will add it to the docs unless you would like to, or if you have a better version, feel free to improve mine.
fio --randrepeat=1 \
--ioengine=libaio \
--direct=1 \
--gtod_reduce=1 \
--name=etcd-disk-io-test \
--filename=etcd_read_write.io \
--bs=4k --iodepth=64 --size=4G \
--readwrite=randrw --rwmixread=75
Does this answer your question?
In general I agree; I should have done it a while ago. Thanks for the reminder.
@hexfusion I am measuring Etcd performance (we're using SSDs) and seeing that both backend commit and WAL f(data)sync durations are above the recommended thresholds reported at https://github.com/etcd-io/etcd/blob/master/Documentation/faq.md#what-does-the-etcd-warning-apply-entries-took-too-long-mean and https://github.com/etcd-io/etcd/blob/master/Documentation/faq.md#what-does-the-etcd-warning-failed-to-send-out-heartbeat-on-time-mean, and I was looking for a fio job file to benchmark those. But the point of the issue was to abstract from my personal use case and have some fio job files added to the docs. I would have done it myself if I were able to; unfortunately, I am very inexperienced with fio, disk I/O and Etcd.
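For anyone wanting to look at the same two durations on their own cluster, here is a minimal sketch that filters them out of etcd's Prometheus metrics. The histogram names `etcd_disk_wal_fsync_duration_seconds` and `etcd_disk_backend_commit_duration_seconds` are the ones etcd exposes; the helper name `filter_disk_metrics` and the endpoint `http://127.0.0.1:2379/metrics` in the usage comment are assumptions for illustration, so adjust them to your deployment.

```shell
# Hypothetical helper: keep only the WAL fsync and backend commit
# histogram lines from a Prometheus metrics dump read on stdin.
filter_disk_metrics() {
  grep -E '^etcd_disk_(wal_fsync|backend_commit)_duration_seconds'
}

# Usage (assumes etcd serves metrics on the local client URL):
#   curl -s http://127.0.0.1:2379/metrics | filter_disk_metrics
```

The `_bucket` lines in the output give cumulative counts per latency bound, which is what the percentile thresholds in the FAQ refer to.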
No problem at all, I will add this now.
@hexfusion : we wrote up something like what we think is needed. See https://www.ibm.com/blogs/bluemix/2019/04/using-fio-to-tell-whether-your-storage-is-fast-enough-for-etcd/
@MikeSpreitzer thanks for doing this, I am excited to read it over the weekend. I will think about where best to link this from the docs, but if you have a vision, please open a PR and we can add it.
Any news here?
I opened a PR: https://github.com/etcd-io/etcd/pull/10685
@hexfusion : we wrote up something like what we think is needed. See https://www.ibm.com/blogs/bluemix/2019/04/using-fio-to-tell-whether-your-storage-is-fast-enough-for-etcd/
This link seems to be broken now.
Thanks. So... there's one huge discrepancy between https://github.com/etcd-io/etcd/issues/10577#issuecomment-475624306 and that blog entry, which is --direct=1 in the former and not the latter. Does etcd really use O_DIRECT? It doesn't look like it to me. Using O_DIRECT (or not) has a lot of implications.
Does etcd really use O_DIRECT?
I don't remember for sure.
But the fio parameters in the blog entry produce disk I/O that is much more similar to etcd's than what the fio parameters in #10577 (comment) produce (at least that was the case when we wrote the blog post).
The fio parameters in the blog post were derived by comparing the system call traces of fio and etcd and by trying to make them as similar as possible in the parts that affect disk I/O. I clearly remember that with the parameters in #10577 (comment), the portion of the system call trace describing disk I/O was significantly different from etcd's.
So I'd say if --direct=1 is missing from the blog entry, you should not use it (warning again: this was true some time ago).
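Since the blog links in this thread keep breaking, here is a sketch of an fio job file along the lines of what the blog post describes, as far as I recall it: sequential small writes through the plain `sync` engine, each followed by an fdatasync, with no `direct=1`. The job name, file name `etcd-wal-test.fio` and directory `test-data` are placeholders, and the `size` and `bs` values are the ones I remember from the post, so treat them as assumptions and check them against the archived copy.

```shell
# Sketch only: write a fio job file that approximates etcd's WAL write
# pattern (sequential writes, fdatasync after every write, no O_DIRECT).
cat > etcd-wal-test.fio <<'EOF'
[etcd-wal-test]
rw=write
ioengine=sync
fdatasync=1
directory=test-data
size=22m
bs=2300
EOF

# fio expects the target directory to exist already.
mkdir -p test-data

echo "Job file written; run it with: fio etcd-wal-test.fio"
```

In the fio output, the figure to compare against the WAL threshold is the 99th percentile in the fsync/fdatasync latency section, not the throughput numbers.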
See https://www.ibm.com/cloud/blog/using-fio-to-tell-whether-your-storage-is-fast-enough-for-etcd
This link is also broken. I'll leave an archive.org link here in case someone else comes looking for it, as I did: https://web.archive.org/web/20210527090640/https://www.ibm.com/cloud/blog/using-fio-to-tell-whether-your-storage-is-fast-enough-for-etcd
Disk performance is paramount to Etcd. https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/hardware.md suggests measuring it with fio. But disk I/O can happen in a lot of different ways, and fio is complex to use. For a user who is not experienced with Etcd disk I/O and/or fio, but needs to assess whether their storage lives up to the requirements Etcd has, writing a meaningful fio job file which does I/O in the same way Etcd does is hard.
I think having such a file or at least some guidelines on how to write such a file would be extremely beneficial for the users. There are different disk metrics which are crucial to Etcd (WAL f(data)sync duration, backend commit time). Maybe one file for each metric is needed? Maybe the cli parameters in https://github.com/etcd-io/etcd/issues/10414#issuecomment-455227063 are good candidates? @hexfusion what do you think? In the comment you wrote you wanted to add something similar to the repo.