Closed: dangeReis closed this issue 3 years ago.
Hi :)
What's the CPU in this system? I can't find the exact level ZFS uses for lz4 compression, but the references I'm reading seem to agree on roughly 500 MB/s of compression throughput per CPU core. If I compare that to lz4 -b benchmarks on my current box (with a somewhat dated 2GHz Core i7), that would put it at either level 1 or 2, since the numbers I get here match (~550 MB/s for both), while level 3 takes a nosedive to about a tenth of that speed, which rules it out.
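A quick way to reproduce that kind of comparison, assuming the lz4 command-line tool is installed (the input file is just a placeholder):

```sh
# Benchmark lz4 compression levels 1 through 3 on a sample file
# (sample.bin is a placeholder; any reasonably large file will do)
lz4 -b1 -e3 sample.bin
```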
2.34 GB/s looks like four saturated cores that are slightly more performant than mine. If this is a 4-core setup you're running, I'm guessing that would be on par with expected performance? You mention Amazon Linux, so keep in mind that for some configurations on AWS a vCPU is a thread on a hyperthreaded host CPU, so you may have to subtract from performance expectations if your instance type falls into that category.
BTW, I'm not an authority here, I just spotted your headline while looking for something else and found it interesting :D Sorry if I'm not contributing.
It's got 24 of these:
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 85
model name : Intel(R) Xeon(R) Platinum 8175M CPU @ 2.50GHz
stepping : 4
microcode : 0x200005e
cpu MHz : 3114.812
cache size : 33792 KB
physical id : 0
siblings : 24
core id : 0
cpu cores : 12
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ss ht syscall nx pdpe1gb rdtscp lm constant_tsc rep_good nopl xtopology nonstop_tsc cpuid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand hypervisor lahf_lm abm 3dnowprefetch invpcid_single pti fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm mpx avx512f avx512dq rdseed adx smap clflushopt clwb avx512cd avx512bw avx512vl xsaveopt xsavec xgetbv1 xsaves ida arat pku ospke
bugs : cpu_meltdown spectre_v1 spectre_v2 spec_store_bypass l1tf mds
bogomips : 5000.00
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:
Haha :D Cool!
Well, that ends my thoughts then, but I'll be following with interest.
Nice rig :)
Wait, one more thing, if it's okay that I ask... I just looked up the CPU. (EDIT) At first, Google made me believe it was a different model, putting your total CPU cost at $314,342.88. But even without that error, why are you throwing 288 CPU cores at a 15TB (or 7.5TB mirrored) zpool? Seems insane to me :/
Also, the only Google results I get for your model are from cloud providers. Are you actually on AWS? Perhaps something is going wrong on their end, then. I mean, 288 cores at 3.2 GHz with the performance you're reporting: even if it seems insane to throw that many cores at ZFS when four or eight would do, something is clearly off :)
EDIT 2: I just realized I'm the same as one of those people asking "why'd you wanna do that?" when someone asks a legitimate question. Sorry for that. Just ignore me. I'll still be following with interest :)
I only have 24 CPU cores. This is on AWS.
Something somewhere seems to be topping out at 2.4GB/s.
Edit: I see where you're getting 288. You're multiplying 24 x 12, and I don't think that's how it works. There are a total of 24 CPUs in /proc/cpuinfo, and in reality only half of those are real cores; the rest are hyperthreads.
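If it helps, a quick way to see the core/thread split on the instance (assuming lscpu is available):

```sh
# Logical CPUs reported by the kernel
grep -c ^processor /proc/cpuinfo

# Sockets, cores per socket, and threads per core
lscpu | grep -E 'Socket\(s\)|Core\(s\) per socket|Thread\(s\) per core'
```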
Ah. And no, I got the 288 from you saying "It's got 24 of these" and then showing the cpu info of a 12-core CPU :) A single one of those makes a lot more sense. But, I'll shut up now. I've already butted in way too much here. Apologies.
@dangeReis I'd suggest trying the same test but without setting --sync=1 and see if your results improve. Forcing all the IO to be synchronous means that ZFS must perform the compression before it can return from the system call. Letting ZFS handle it asynchronously should allow the IO threads to process it more efficiently and stream it to disk.
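As a sketch only, an asynchronous variant of a 128k sequential-write job might look like this; the mountpoint, job size, queue depth, and job count are assumptions, not the values used in the report above:

```sh
# Hypothetical 128k sequential-write job against the pool, without --sync=1
fio --name=zfs-write-async \
    --directory=/tank \
    --rw=write --bs=128k --size=16G \
    --ioengine=libaio --iodepth=32 \
    --numjobs=4 --group_reporting --end_fsync=1
```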
This issue has been automatically marked as "stale" because it has not had any activity for a while. It will be closed in 90 days if no further activity occurs. Thank you for your contributions.
System information
Describe the problem you're observing
I'm seeing an order of magnitude slower speeds from ZFS with compression enabled versus no compression.
Describe how to reproduce the problem
This machine has two 7.5TB NVMe drives. Here is fio on a single drive with a block size of 128k.
As you can see, a single drive can do over 1.5GB/s with this configuration.
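For illustration, a 128k sequential-write fio job against a single raw drive could look roughly like this; the device path, job size, queue depth, and job count are assumptions, not the exact values used:

```sh
# Hypothetical example: 128k sequential writes to one NVMe drive
# WARNING: writing to a raw device destroys any data on it
fio --name=single-drive-128k \
    --filename=/dev/nvme1n1 \
    --rw=write --bs=128k --size=16G \
    --ioengine=libaio --iodepth=32 --direct=1 \
    --numjobs=4 --sync=1 --group_reporting
```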
Let's create a pool with ashift=12
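For example, assuming a striped pool named tank over the two NVMe devices (the pool name, device paths, and layout are assumptions):

```sh
# Hypothetical pool creation; ashift=12 forces 4K-aligned allocation
zpool create -o ashift=12 tank /dev/nvme1n1 /dev/nvme2n1
```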
Now let's test again with the same block size.
2.3GB/s. Not quite double, but very respectable.
Let's try with compression.
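Enabling compression and later checking the achieved ratio would look something like this; the dataset name tank is a placeholder, and lz4 is assumed from the discussion above:

```sh
# Turn on lz4 compression for the dataset (name is a placeholder)
zfs set compression=lz4 tank

# After writing data, check the achieved compression ratio
zfs get compressratio tank
```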
So we're still getting about 2.4GB/s but iotop says
zpool iostat -v 1
So we're getting about 2.4GB/s uncompressed, but once it gets compressed it only makes it to the disks at about 900MB/s.
With a compression ratio of 1.99, I would've expected the disks to still be writing at around 2GB/s and the data to be written at around 4GB/s: if the disks can absorb roughly 2.3GB/s of physical writes and compression halves the data, the logical write rate should roughly double, not stay flat while the physical rate drops.
Include any warning/errors/backtraces from the system logs