yzhan298 / ceph

Ceph is a distributed object, block, and file storage platform
http://ceph.com

How to distinguish a write from a deferred write #4

Open yzhan298 opened 4 years ago

yzhan298 commented 4 years ago

How do we distinguish a direct write from a deferred write?

The following ceph.conf options control direct writes and deferred writes:

bdev_block_size = 4096 (default 4 KB; should be <= bluestore_min_alloc_size)
bluestore_min_alloc_size = 0 (if 0, the value is taken from the hdd or ssd variant): BlueStore never allocates a region smaller than min_alloc_size
bluestore_min_alloc_size_hdd = 65536 (default 64 KB)
bluestore_min_alloc_size_ssd = 4096 (4 KB, set for this test; the default is 16 KB)
bluestore_max_alloc_size = 0
bluestore_prefer_deferred_size = 0
bluestore_prefer_deferred_size_hdd = 0
bluestore_prefer_deferred_size_ssd = 0

When bluestore_prefer_deferred_size and its _hdd and _ssd variants are set to 0, and bluestore_min_alloc_size_ssd is set to 4096, deferred writes are suppressed as far as possible (in my 4k test, deferred writes made up less than 1% of all writes).
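
To make the "if 0, select from hdd or ssd values" rule concrete, here is a minimal Python sketch. The helper `effective_option` is hypothetical and is not BlueStore code. It also assumes that bluestore_prefer_deferred_size follows the same fallback rule as bluestore_min_alloc_size, which the list above states only for the latter.

```python
def effective_option(generic: int, hdd: int, ssd: int, rotational: bool) -> int:
    """Return the generic value, or the device-specific one when generic is 0.

    Hypothetical helper that mirrors the "if 0, select from hdd or ssd
    values" note in the option list; not taken from the Ceph source.
    """
    if generic != 0:
        return generic
    return hdd if rotational else ssd


# Values copied from the settings listed in this issue.
BDEV_BLOCK_SIZE = 4096
min_alloc_ssd = effective_option(0, 65536, 4096, rotational=False)
min_alloc_hdd = effective_option(0, 65536, 4096, rotational=True)
# Assumption: prefer_deferred_size resolves the same way.
prefer_deferred_ssd = effective_option(0, 0, 0, rotational=False)

# bdev_block_size should not exceed the effective min_alloc_size.
assert BDEV_BLOCK_SIZE <= min_alloc_ssd

print(min_alloc_ssd, min_alloc_hdd, prefer_deferred_ssd)
```

With the values above, the SSD case resolves to a 4096-byte min_alloc_size, which equals bdev_block_size, so the constraint from the option list still holds.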

The following perf counters show simple (direct) writes and deferred writes:

"bluestore_write_big": number of aligned writes; these do not need to be deferred
"bluestore_write_big_bytes": bytes written by aligned writes
"bluestore_write_big_blobs": blobs written by aligned writes
"bluestore_write_small": number of small (unaligned) writes, i.e. writes with length < min_alloc_size
"bluestore_write_small_bytes": bytes written by small writes
"bluestore_write_small_unused": number of write requests that hit an unused block in an existing extent
"bluestore_write_small_deferred": number of small write requests that went through the deferred write path
"bluestore_write_small_new": number of write requests that were written immediately to a new location

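As a rough way to reproduce the "less than 1%" figure from these counters, the sketch below computes the deferred share of all writes. It assumes that total writes = bluestore_write_big + bluestore_write_small, which is my reading of the counters and is not stated in the issue. The sample numbers are made up for illustration and are not measurements. How the counters are nested in a real perf dump is not shown here, so the function takes a flat dict.

```python
def deferred_ratio(counters: dict) -> float:
    """Share of writes that were deferred, from a flat dict of counters.

    Assumption: total writes = write_big + write_small.
    """
    total = (counters.get("bluestore_write_big", 0)
             + counters.get("bluestore_write_small", 0))
    if total == 0:
        return 0.0
    return counters.get("bluestore_write_small_deferred", 0) / total


# Hypothetical counter values, for illustration only.
sample = {
    "bluestore_write_big": 99000,
    "bluestore_write_small": 1000,
    "bluestore_write_small_deferred": 800,
}

print(f"deferred share: {deferred_ratio(sample):.2%}")
```

Taking the counters before and after a test run and passing the differences to `deferred_ratio` would isolate the workload under test from earlier activity.
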
The following conf options control the throttle:

bluestore_throttle_bytes = 10720000: the total budget for the direct write path; we call it b1.
bluestore_throttle_deferred_bytes = 10720000: the budget for deferred writes; we call it b2. The effective deferred limit is b1 + b2 = 10720000 + 10720000 = 21440000.
bluestore_throttle_cost_per_io = 0: the cost charged for each IO in BlueStore
bluestore_throttle_cost_per_io_hdd = 670000
bluestore_throttle_cost_per_io_ssd = 4000
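
To get a feel for these numbers, the sketch below works out how many 4 KB writes fit into each budget. It assumes each IO is charged its per-IO cost plus its length in bytes. That cost model is my assumption and should be checked against the BlueStore source. The function names are hypothetical.

```python
B1 = 10720000               # bluestore_throttle_bytes
B2 = 10720000               # bluestore_throttle_deferred_bytes
COST_PER_IO_HDD = 670000    # bluestore_throttle_cost_per_io_hdd
COST_PER_IO_SSD = 4000      # bluestore_throttle_cost_per_io_ssd


def effective_deferred_budget(b1: int, b2: int) -> int:
    """Deferred limit as described in the issue: b1 + b2."""
    return b1 + b2


def io_cost(cost_per_io: int, length: int) -> int:
    """Assumed cost model: fixed per-IO cost plus the payload size in bytes."""
    return cost_per_io + length


def max_inflight(budget: int, cost: int) -> int:
    """Number of whole IOs of the given cost that fit into the budget."""
    return budget // cost


write_len = 4096
ssd_cost = io_cost(COST_PER_IO_SSD, write_len)   # 8096
hdd_cost = io_cost(COST_PER_IO_HDD, write_len)   # 674096

print("deferred budget:", effective_deferred_budget(B1, B2))
print("4k writes within b1 on ssd:", max_inflight(B1, ssd_cost))
print("4k writes within b1 on hdd:", max_inflight(B1, hdd_cost))
```

Under this assumed model, the HDD per-IO cost dominates a 4 KB write, so the same budget admits far fewer writes in flight on HDD than on SSD.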