That job produces about 950 MB/s to 1 GB/s on a raw Optane P4800X device. What performance do you see?
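For reference, a comparable baseline run against the raw cache device might look like the sketch below. The /dev/nvme0n1 path is an assumption taken from the iostat output later in this thread, and the job overwrites the device, so it should only be run on an idle disk that is not attached as a cache:

fio --name=raw-baseline --filename=/dev/nvme0n1 --rw=randwrite --bs=8k \
    --direct=1 --fdatasync=1 --ioengine=libaio --iodepth=32 --numjobs=1 \
    --time_based --runtime=60 --norandommap --random_distribution=zipf:1.2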
I am testing a direct sync IO pattern in fio, and the writeback cache device seems to be very slow. Can anyone help explain why, and how to optimize it?
The fio configuration is below:
[global]
name=fio-rand-write
filename=/dev/cas1-12:/dev/cas1-1:/dev/cas1-2:/dev/cas1-3:/dev/cas1-4:/dev/cas1-5:/dev/cas1-6:/dev/cas1-7:/dev/cas1-8:/dev/cas1-9:/dev/cas1-10:/dev/cas1-11
rw=randwrite
bs=8K
direct=1
fdatasync=0
numjobs=1
time_based=1
runtime=1000000000
norandommap
random_distribution=zipf:1.2
[file1]
ioengine=libaio
iodepth=32
avg-cpu:  %user  %nice  %system  %iowait  %steal  %idle
           1.31   0.00     8.43     0.08    0.00  90.18

Device:    rrqm/s  wrqm/s     r/s       w/s  rMB/s   wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
sdm          0.00    0.00    0.00      0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
sdd          0.00    4.00    0.00    153.00   0.00    1.05     14.12     10.71  73.22     0.00    73.22   6.39  97.80
sdl          0.00    6.00    0.00    110.00   0.00    0.80     14.98      2.57  29.36     0.00    29.36   7.45  82.00
sde          0.00    0.00    0.00      0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
sdj          0.00    7.00    0.00    134.00   0.00    0.96     14.69      3.58  31.22     0.00    31.22   6.13  82.10
sdf          0.00    2.00    0.00    133.00   0.00    0.91     14.08      3.22  28.62     0.00    28.62   6.71  89.20
sdh          0.00    7.00    0.00    148.00   0.00    1.07     14.81      4.30  28.30     0.00    28.30   5.83  86.30
sdg          0.00    0.00    0.00      0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
sdi          0.00    0.00    0.00      0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
sdb          0.00    3.00    0.00    157.00   0.00    1.10     14.37      8.28  54.17     0.00    54.17   5.96  93.60
sdk          0.00    0.00    0.00      0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
sdc          0.00    0.00    0.00      0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
sda          0.00    0.00    0.00      0.00   0.00    0.00      0.00      0.00   0.00     0.00     0.00   0.00   0.00
nvme0n1      0.00    0.00  160.00  17562.00   0.62  131.58     15.28      0.48   0.03     0.25     0.02   0.01  19.10
cas1-1       0.00    0.00    0.00   2896.00   0.00    0.00      0.00      0.00   0.03     0.00     0.03   0.00   0.00
cas1-2       0.00    0.00    0.00   2895.00   0.00   22.62     16.00      0.00   2.70     0.00     2.70   0.00   0.00
cas1-3       0.00    0.00    0.00   2895.00   0.00    0.00      0.00      0.00   0.03     0.00     0.03   0.00   0.00
cas1-4       0.00    0.00    0.00   2892.00   0.00   22.59     16.00      0.00   3.60     0.00     3.60   0.00   0.00
cas1-5       0.00    0.00    0.00   2892.00   0.00    0.00      0.00      0.00   0.03     0.00     0.03   0.00   0.00
cas1-6       0.00    0.00    0.00   2894.00   0.00   22.61     16.00      0.00   1.21     0.00     1.21   0.00   0.00
cas1-7       0.00    0.00    0.00   2894.00   0.00    0.00      0.00      0.00   0.03     0.00     0.03   0.00   0.00
cas1-8       0.00    0.00    0.00   2889.00   0.00   22.57     16.00      0.00   1.36     0.00     1.36   0.00   0.00
cas1-9       0.00    0.00    0.00   2889.00   0.00    0.00      0.00      0.00   0.03     0.00     0.03   0.00   0.00
cas1-10      0.00    0.00    0.00   2892.00   0.00   22.59     16.00      0.00   1.32     0.00     1.32   0.00   0.00
cas1-11      0.00    0.00    0.00   2892.00   0.00    0.00      0.00      0.00   0.03     0.00     0.03   0.00   0.00
cas1-12      0.00    0.00    0.00   2896.00   0.00   22.62     16.00      0.00   1.07     0.00     1.07   0.00   0.00
In iostat I can see that several devices, such as cas1-1 and cas1-3, receive requests, but bandwidth and IOPS for them (rMB/s, wMB/s, avgrq-sz, avgqu-sz) always show zero. @fxober By the way, my cache device is an Intel P3700 2TB. Is something wrong, or is a lock hanging on the IO path?
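To cross-check whether this is just an iostat reporting artifact, the raw block-layer counters can also be read directly; a rough sketch, assuming the cas1-* devices expose the standard /sys/block/<dev>/stat file (fields: reads completed, reads merged, sectors read, read ticks, the same four for writes, then in-flight, io ticks, time in queue):

watch -n 1 'for d in cas1-1 cas1-2 cas1-3; do printf "%-8s " "$d"; cat /sys/block/$d/stat; done'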
When I change the jobs to the configuration below, everything looks right.
[global]
name=fio-rand-write
rw=randrw
#rate_iops=40000
bs=4K
direct=1
fdatasync=1
numjobs=4
time_based=1
runtime=1000000000
norandommap
random_distribution=zipf:1.6
[file1]
filename=/dev/cas1-1
ioengine=libaio
iodepth=1
[file2]
filename=/dev/cas1-2
ioengine=libaio
iodepth=1
[file3]
filename=/dev/cas1-3
ioengine=libaio
iodepth=1
[file4]
filename=/dev/cas1-4
ioengine=libaio
iodepth=1
[file5]
filename=/dev/cas1-5
ioengine=libaio
iodepth=1
[file6]
filename=/dev/cas1-6
ioengine=libaio
iodepth=1
Hi @mslovy,
I think the poor performance when using
filename=/dev/cas1-12:/dev/cas1-1:/dev/cas1-2:/dev/cas1-3:/dev/cas1-4:/dev/cas1-5:/dev/cas1-6:/dev/cas1-7:/dev/cas1-8:/dev/cas1-9:/dev/cas1-10:/dev/cas1-11
in the fio config file is caused by only a single fio thread issuing IO to the CAS devices. In the second config you attached, each CAS device has its own thread, so higher bandwidth is expected.
About the strange iostat output you attached: I believe it is a fio issue, not a CAS one, because running fio with a single IO thread against raw devices gives exactly the same result.
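As a sketch of that point (not a config from the maintainers): one way to add submitters while keeping a single colon-separated device list is to raise numjobs, so that several clones share the twelve devices. With fdatasync=1 each clone blocks on its own flushes, so the extra clones are what buy back bandwidth:

[global]
name=fio-rand-write-mt
filename=/dev/cas1-1:/dev/cas1-2:/dev/cas1-3:/dev/cas1-4:/dev/cas1-5:/dev/cas1-6:/dev/cas1-7:/dev/cas1-8:/dev/cas1-9:/dev/cas1-10:/dev/cas1-11:/dev/cas1-12
rw=randwrite
bs=8K
direct=1
fdatasync=1
time_based=1
runtime=60
norandommap
random_distribution=zipf:1.2
[writers]
ioengine=libaio
iodepth=32
numjobs=12

Each of the twelve clones still opens all twelve devices and round-robins across them, so this is not identical to the one-section-per-device config above, but it avoids the single-submitter bottleneck.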
Thanks for your reply.
@mslovy
Can we close the issue?
Yeah, if there are any further questions, I will reopen it.
I am testing a direct sync IO pattern in fio, and the writeback cache device seems to be very slow. Can anyone help explain why, and how to optimize it?
The fio configuration is below:
[global]
name=fio-rand-write
filename=/dev/cas1-12:/dev/cas1-1:/dev/cas1-2:/dev/cas1-3:/dev/cas1-4:/dev/cas1-5:/dev/cas1-6:/dev/cas1-7:/dev/cas1-8:/dev/cas1-9:/dev/cas1-10:/dev/cas1-11
rw=randwrite
rate_iops=40000
bs=8K
direct=1
fdatasync=1
numjobs=1
time_based=1
runtime=1000000000
norandommap
random_distribution=zipf:1.2
[file1]
ioengine=libaio
iodepth=32