microsoft / diskspd

DISKSPD is a storage load generator / performance test tool from the Windows/Windows Server and Cloud Server Infrastructure Engineering teams
MIT License

Higher than expected random write IOPS #129

Open hazemawadalla opened 4 years ago

hazemawadalla commented 4 years ago

I created a script that tests the random read/write performance of SSDs after sequential and random preconditioning. The test basically runs these functions in order:

Create_NTFS_Volume Sequential_Precondition Random_Precondition Random_Workload Sequential_Precondition Sequential_Workload Process_XML Delete_NTFS_Volume
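For reference, a minimal sketch of that sequence as PowerShell (the function bodies are defined elsewhere in the script; the comments summarize the flags quoted below):

# Sketch of the test sequence; each function is defined elsewhere in the script.
Create_NTFS_Volume          # create and format the test volume
Sequential_Precondition     # 128K sequential 100%-write fill (flags below)
Random_Precondition         # 4K random 100%-write fill
Random_Workload             # measured random sweep, results captured via -Rxml
Sequential_Precondition
Sequential_Workload         # measured sequential sweep
Process_XML                 # parse the XML results
Delete_NTFS_Volume          # tear down the volume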

The sequential precondition flags:

$sp = Start-Process -NoNewWindow -FilePath "$Diskspdpath" -ArgumentList "-b128k -d9000 -o128 -t1 -Suw -w100 -L -c$FileSize $DataFile" -PassThru
$sp.WaitForExit()

The random precondition flags:

$rp = Start-Process -NoNewWindow -FilePath "$Diskspdpath" -ArgumentList "-b4k -d9000 -o32 -t4 -Suw -r -w100 -L -c$FileSize $DataFile" -PassThru
$rp.WaitForExit()

The random workload flags:

$p = Start-Process -NoNewWindow -FilePath "$Diskspdpath" -ArgumentList "-b$bs -d$Time -o$qdepth -t$thread -Suw -r -Rxml -w$wp -L -c$FileSize $DataFile" -RedirectStandardOutput $stdOutLog -RedirectStandardError $stdErrLog -PassThru
$p.WaitForExit()

However, I am getting far higher random write IOPS than the SSD is capable of. Is there something I am missing in the switches?
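(For reference, a Process_XML step can read the -Rxml report directly. Below is a hedged sketch that derives aggregate write IOPS from it; the element names TimeSpan/TestTimeSeconds and Thread/Target/WriteCount are assumptions based on a typical DISKSPD XML report and may differ between versions.)

# Hedged sketch: compute aggregate write IOPS from a DISKSPD -Rxml report.
# Element names are assumptions and may vary by DISKSPD version.
[xml]$result = Get-Content $stdOutLog -Raw
$testSeconds = [double]$result.Results.TimeSpan.TestTimeSeconds
$writes = 0
foreach ($t in $result.Results.TimeSpan.Thread.Target) {
    $writes += [int64]$t.WriteCount
}
"Write IOPS: {0:N0}" -f ($writes / $testSeconds)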

dl2n commented 4 years ago

Not specifying noncached IO would be the usual error, and you're doing that correctly with -Suw.

A possible problem is defeating your preconditioning by recreating the loadfile on each step. On deletion, unless you've disabled delete notification in the filesystem, this will issue a TRIM/UNMAP to the SSD which will clean out its LBA/PBA mapping. Don't do that.
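(For anyone following along: delete notification can be checked and toggled with the built-in fsutil command, shown below. A value of 0 means TRIM-on-delete is enabled, 1 means it is disabled; changing it requires an elevated prompt.)

# Check whether delete notification (TRIM/UNMAP on file delete) is currently enabled.
fsutil behavior query DisableDeleteNotify

# Disable it for the duration of the test (restore with 0 afterwards).
fsutil behavior set DisableDeleteNotify 1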

Other than that, if you could mention the result you're getting and the device, I could help reason about whether the result really is unreasonable. The specific numbers might suggest other causes.



hazemawadalla commented 4 years ago

Hi Don,

After you mentioned it, I tested not deleting the file between steps, and I'm still getting high random write numbers. I work for a drive manufacturer and we use fio for QoS testing. I will attach the fio results (against which the drive was spec'd), compared to the latest run of my script.

The 4K random write at QD32 with fio is 25,400 IOPS, compared to 81,841 IOPS with diskspd. Can you help me understand the discrepancy?

Kingston_SEDC500R7680_7680G_J2.9 2-sample comparison_fiov317.xlsx

diskspdresults.zip

dl2n commented 4 years ago

I’ll take a look at what you attached shortly.

However: as a device maker I’d expect you have access to device statistics to confirm whether the number of operations or bytes transferred matches (roughly) what DISKSPD reports.

What is your expectation for the device capability?


hazemawadalla commented 4 years ago

Hey Dan, we designed the drive to be a read-intensive SSD capable of 99,000 read / 26,000 write IOPS at 4K QD32, 1 thread.

dl2n commented 4 years ago

I replied with this by the email connector on the 21st, but it appears not to have made it.

I pulled up all of the results (thanks for including the full sweep in XML form!).

Focusing on diskspeedrandom_1_100_4k_32, it looks OK: single thread QD32 random 4KB/4KB 100% unbuffered writethrough write. Load is to a ~7.5TB file which I assume is your device ~fully allocated.

The one thing that occurs to me is that you're using the default non-zero but constant fill pattern for the write buffer source (bytes are 0 - 1 - 2 - 3 .... - 255, repeating). Does your device have intelligence to detect constant buffer fill and optimize/dedup the operations? I'm not sure what FIO's default is, but if it is random or at least a pattern your device may not recognize in the IO path, that may be the difference.

Iff your device does have this intelligence, try this to overcome it: DISKSPD supports creating a random write source buffer with the -Z switch. A size on the order of a few 10's of MiB is usually plenty. In DISKSPD 2.0.17a the write source will be chosen at 4-byte aligned offsets within the buffer, 512-byte aligned in the most recent release to avoid processor architectural effects of sub-cacheline aligned buffers (several % overhead in certain cases).
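As a concrete example, here is a hedged sketch of the random-workload invocation from earlier with a 32 MiB random write source buffer added; the 32M size is an assumption, and anything in the tens of MiB should behave similarly:

# Same random-workload invocation as above, with -Z32M adding a 32 MiB random
# write source buffer so the data written is not the default repeating fill pattern.
$p = Start-Process -NoNewWindow -FilePath "$Diskspdpath" `
     -ArgumentList "-b$bs -d$Time -o$qdepth -t$thread -Suw -r -Rxml -w$wp -L -Z32M -c$FileSize $DataFile" `
     -RedirectStandardOutput $stdOutLog -RedirectStandardError $stdErrLog -PassThru
$p.WaitForExit()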

Last, if you can get the host interface statistics that should strongly narrow down where the disconnect is.

111alan commented 4 years ago

I also ran into the same problem; the results are way higher than what iometer and fio show.

diskspd.exe -b4K -t16 -r -o16 -d39 -Sh E:\iobw.tst

This ran on an EPYC2 7702 and a PM1725a, with a 20GB file with pseudo-random content created with iometer. All three tests were done with the same file.

(Screenshot attachment: 7702_windows_5x)

However, the same tests show similar results (about 800K IOPS) on Intel platforms.

dl2n commented 4 years ago

We have likely root caused @hazemawadalla's issue offline. It has to do with differences in SSD preconditioning methodology between the two specific devices/platforms he was making his comparative runs on. It will take about a week to make the confirming runs, but I suspect that will close his specific issue.

If you open a separate issue, we can see about root causing yours.

111alan commented 4 years ago

@dl2n Thx, I've posted a new issue.

hazemawadalla commented 4 years ago

This is not an issue with diskspd per se, but a limitation: SSD preconditioning by capacity would be an essential enhancement to diskspd. It is difficult to quantify SSD performance at steady state vs. fresh-out-of-box (FOB), especially at larger SSD capacities. With fio (and other tools, like iometer) you can specify --loops=N, which ensures all LBAs are written more than once. The SNIA spec recommends writing 2x the capacity, and diskspd has no graceful way of doing that.

There is a workaround, but it requires tedious coding: keep track of the total bytes written using Windows perfmon and stop the random/sequential precondition process once you hit 2x capacity. An easier workaround: if you know your device's approximate data rate for random or sequential workloads, you can calculate roughly how long the precondition run needs to be, as sketched below.
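Below is a hedged sketch of both workarounds, under assumed values for the device capacity, the perfmon counter instance, and the steady write rate; all of these need to be adjusted for the disk under test.

# Workaround 1: poll perfmon and stop the preconditioning run after ~2x capacity is written.
$capacityBytes = 7680GB                                      # assumed device capacity
$targetBytes   = 2 * $capacityBytes                          # SNIA-style 2x capacity fill
$counter       = '\PhysicalDisk(1 E:)\Disk Write Bytes/sec'  # assumed counter instance

# Launch a deliberately long sequential fill; the loop below stops it at the byte target.
$pre = Start-Process -NoNewWindow -FilePath "$Diskspdpath" `
       -ArgumentList "-b128k -d360000 -o128 -t1 -Suw -w100 -c$FileSize $DataFile" -PassThru

$written = 0
while (-not $pre.HasExited -and $written -lt $targetBytes) {
    Start-Sleep -Seconds 5
    $bps      = (Get-Counter -Counter $counter).CounterSamples[0].CookedValue
    $written += $bps * 5                                     # approximate bytes written in the interval
}
if (-not $pre.HasExited) { Stop-Process -Id $pre.Id }

# Workaround 2: estimate the preconditioning duration from an assumed steady write
# rate and pass it to diskspd's -d switch instead of polling.
$assumedWriteBytesPerSec = 450MB                             # assumption: measure on your device
$precondSeconds = [int][math]::Ceiling($targetBytes / $assumedWriteBytesPerSec)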