I ran NDS at 3TB on our performance cluster in this mode. That cluster has very high disk bandwidth and plenty of host memory. In this setup the run with compression enabled was about 2x slower than the run without it. I would treat this as the worst-case scenario, both because of the spill simulated here and because anything we add is pure overhead in an environment with IO this fast.
We may see different results with slower IO or restricted host memory (a cloud VM would be good to try), where the compressed case may match or even beat the uncompressed case. We should find examples where compression improves performance and use that data to give the auto tuner thresholds it could use, likely relative to disk bandwidth, overall capacity, and the amount of spill detected in application history logs.
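To make the "relative to disk bandwidth" idea concrete, here is a rough break-even sketch. It assumes spill is strictly serial (compress, then write), ignores decompression on read, and takes the compression ratio from the 25TB vs 12TB spill volumes reported in this issue. `COMPRESS_GBPS` is a hypothetical single-thread throughput, not a measurement, and the function names are made up for illustration.

```python
def serial_spill_seconds(size_gb, disk_gbps, compress_gbps=None, ratio=1.0):
    """Time to spill `size_gb` serially: optional compression, then the disk write."""
    compress_s = size_gb / compress_gbps if compress_gbps else 0.0
    write_s = (size_gb / ratio) / disk_gbps
    return compress_s + write_s


def break_even_disk_gbps(compress_gbps, ratio):
    """Disk bandwidth below which compressing first beats writing raw bytes.

    Derived from S/B_disk > S/B_comp + S/(ratio * B_disk).
    """
    return compress_gbps * (1.0 - 1.0 / ratio)


RATIO = 25.0 / 12.0    # spill volumes reported in this issue (uncompressed / compressed)
COMPRESS_GBPS = 0.5    # ASSUMPTION: hypothetical single-thread compression throughput

threshold = break_even_disk_gbps(COMPRESS_GBPS, RATIO)
for disk_gbps in (0.1, threshold, 2.0):
    raw = serial_spill_seconds(100, disk_gbps)
    comp = serial_spill_seconds(100, disk_gbps, COMPRESS_GBPS, RATIO)
    print(f"disk={disk_gbps:.2f} GB/s raw={raw:.0f}s compressed={comp:.0f}s")
```

Under this simplified model compression only pays off when the disk is slower than roughly half the compressor's throughput, which matches the intuition that a fast-IO cluster is the worst case. A real threshold for the auto tuner would need measured bandwidths and should account for reads as well.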
We should also study the results from our performance cluster further and see whether we can speed up spill. Right now spill is completely serial, one buffer at a time, and we know that LZ4's single-thread bandwidth is a limiting factor (re: multi-threaded shuffle). We could use similar mechanisms at spill time to speed up compression (and also encryption) when we have several blocks to spill. A related issue is https://github.com/NVIDIA/spark-rapids/issues/7666: the stack traces show that it is what is really holding everything up, since all tasks end up waiting for the catalog lock while one task is spilling. So there are steps we can take, and we can re-run the spill-all case to see whether we improve our worst case.
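As an illustration of the idea only (not the plugin's code, which lives on the JVM), the sketch below compresses a batch of independent buffers on a thread pool instead of one at a time. It uses `zlib` from the Python standard library as a stand-in for LZ4, and the buffer sizes and thread count are arbitrary assumptions.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def compress_serial(buffers):
    """Compress buffers one at a time, mirroring the serial spill described above."""
    return [zlib.compress(b, 1) for b in buffers]


def compress_parallel(buffers, threads=4):
    """Compress independent buffers concurrently; output order matches input order."""
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(lambda b: zlib.compress(b, 1), buffers))


# ASSUMPTION: a hypothetical spill batch of eight 1 MiB, highly compressible buffers.
buffers = [bytes([i]) * (1 << 20) for i in range(8)]
compressed = compress_parallel(buffers)
restored = [zlib.decompress(c) for c in compressed]
print(sum(len(c) for c in compressed), "bytes after compression")
```

Because each buffer is compressed independently, the result is identical to the serial path. Whether this helps in practice depends on how many buffers are available to spill at once and on contention such as the catalog lock mentioned in the linked issue.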
Name = benchmark
Means = 3802000.0, 7187000.0
Time diff = -3385000.0
Speedup = 0.5290107137887853
T-Test (test statistic, p value, df) = -27.525783960660327, 0.02311795793190776, 1.0
T-Test Confidence Interval = -4947553.244415962, -1822446.7555840379
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
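For anyone reading these numbers: the reported values are consistent with `Time diff` being the difference of the two means, `Speedup` being their ratio, and the p-value and confidence interval coming from a two-sided Student's t-test with 1 degree of freedom at the 95% level. The sketch below reproduces them from the benchmark line using only the closed form for df = 1 (the Cauchy distribution). It is a reconstruction from the output, not the comparison tool's actual code, and the function names are made up.

```python
import math


def two_sided_p_df1(t_stat):
    """Two-sided p-value for Student's t with 1 degree of freedom (Cauchy)."""
    return 1.0 - (2.0 / math.pi) * math.atan(abs(t_stat))


def confidence_interval_df1(diff, t_stat, level=0.95):
    """CI for the mean difference; the standard error is recovered as |diff / t|."""
    std_err = abs(diff / t_stat)
    t_crit = math.tan(math.pi * level / 2.0)  # Cauchy quantile at (1 + level) / 2
    return diff - t_crit * std_err, diff + t_crit * std_err


# Values copied from the "benchmark" record above.
mean_a, mean_b = 3802000.0, 7187000.0
t_stat = -27.525783960660327

diff = mean_a - mean_b
speedup = mean_a / mean_b
p_value = two_sided_p_df1(t_stat)
ci_low, ci_high = confidence_interval_df1(diff, t_stat)
print(diff, speedup, p_value, ci_low, ci_high)
```

A df of 1.0 suggests the test rests on only a couple of runs per configuration, so the p-values are best read alongside the size of the difference rather than on their own.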
In terms of the amount spilled: with compression we spilled 12TB, and without compression we spilled 25TB.
Full results
```
Name = query1
Means = 6415.0, 12243.0
Time diff = -5828.0
Speedup = 0.5239728824634485
T-Test (test statistic, p value, df) = -46.733296789404704, 0.013620323808451023, 1.0
T-Test Confidence Interval = -7412.561036590835, -4243.438963409165
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query3
Means = 8656.0, 12730.0
Time diff = -4074.0
Speedup = 0.6799685781618224
T-Test (test statistic, p value, df) = -196.01041638987795, 0.0032478592763304165, 1.0
T-Test Confidence Interval = -4338.093506098472, -3809.9064939015275
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query4
Means = 153913.0, 266906.5
Time diff = -112993.5
Speedup = 0.5766551208007299
T-Test (test statistic, p value, df) = -39.287460187701285, 0.01620064873435169, 1.0
T-Test Confidence Interval = -149537.43890637613, -76449.56109362387
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query6
Means = 2076.0, 3271.5
Time diff = -1195.5
Speedup = 0.634571297569922
T-Test (test statistic, p value, df) = -14.530994669814685, 0.04374219577439338, 1.0
T-Test Confidence Interval = -2240.870128306454, -150.12987169354642
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query7
Means = 34275.0, 75803.0
Time diff = -41528.0
Speedup = 0.4521588855322349
T-Test (test statistic, p value, df) = -41.196223331454945, 0.015450318657754002, 1.0
T-Test Confidence Interval = -54336.53504577591, -28719.464954224088
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query9
Means = 53082.0, 136078.5
Time diff = -82996.5
Speedup = 0.39008366494339664
T-Test (test statistic, p value, df) = -12.810600619381573, 0.04959419426750873, 1.0
T-Test Confidence Interval = -165316.64663011135, -676.3533698886458
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query11
Means = 78715.0, 144497.0
Time diff = -65782.0
Speedup = 0.5447517941548959
T-Test (test statistic, p value, df) = -19.47654123478562, 0.032657812876891255, 1.0
T-Test Confidence Interval = -108697.19474100177, -22866.805258998225
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query12
Means = 964.0, 1517.0
Time diff = -553.0
Speedup = 0.6354647330257086
T-Test (test statistic, p value, df) = -31.9274698861863, 0.019933045789918242, 1.0
T-Test Confidence Interval = -773.077921748727, -332.9220782512729
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query14_part1
Means = 115181.0, 221352.0
Time diff = -106171.0
Speedup = 0.5203521992121146
T-Test (test statistic, p value, df) = -18.30879791819945, 0.034736734446920055, 1.0
T-Test Confidence Interval = -179853.08820147382, -32488.911798526184
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query14_part2
Means = 109677.0, 206264.0
Time diff = -96587.0
Speedup = 0.5317311794593337
T-Test (test statistic, p value, df) = -21.01942346408533, 0.030264394239590434, 1.0
T-Test Confidence Interval = -154973.6726399373, -38200.327360062714
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query16
Means = 52068.0, 148311.0
Time diff = -96243.0
Speedup = 0.3510730829136072
T-Test (test statistic, p value, df) = -22.173153215330068, 0.02869184503407946, 1.0
T-Test Confidence Interval = -151394.527190231, -41091.472809769
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query17
Means = 7584.0, 13227.0
Time diff = -5643.0
Speedup = 0.5733726468586982
T-Test (test statistic, p value, df) = -814.4968922592645, 0.0007816106587309643, 1.0
T-Test Confidence Interval = -5731.031168699491, -5554.968831300509
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query22
Means = 8517.0, 16893.0
Time diff = -8376.0
Speedup = 0.5041733262297994
T-Test (test statistic, p value, df) = -19.038920687922463, 0.033407109622622284, 1.0
T-Test Confidence Interval = -13965.979212417667, -2786.0207875823335
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query23_part1
Means = 209113.0, 519179.5
Time diff = -310066.5
Speedup = 0.4027759185406974
T-Test (test statistic, p value, df) = -206.35962794430557, 0.0030849774036093864, 1.0
T-Test Confidence Interval = -329158.25971170206, -290974.74028829794
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query24_part1
Means = 113715.0, 196010.5
Time diff = -82295.5
Speedup = 0.5801474920986376
T-Test (test statistic, p value, df) = -20.608687520318735, 0.030866635017350957, 1.0
T-Test Confidence Interval = -133034.46485916903, -31556.53514083098
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query24_part2
Means = 117606.0, 198203.5
Time diff = -80597.5
Speedup = 0.5933598548966088
T-Test (test statistic, p value, df) = -24.44601435303959, 0.02602735258808861, 1.0
T-Test Confidence Interval = -122489.33240487019, -38705.6675951298
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query25
Means = 6139.0, 9359.5
Time diff = -3220.5
Speedup = 0.6559111063625194
T-Test (test statistic, p value, df) = -19.07032350692502, 0.03335219931563161, 1.0
T-Test Confidence Interval = -5366.259737050089, -1074.740262949911
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query26
Means = 16413.0, 40341.0
Time diff = -23928.0
Speedup = 0.4068565479289061
T-Test (test statistic, p value, df) = -23.179257116055982, 0.02744804303055482, 1.0
T-Test Confidence Interval = -37044.64413622413, -10811.355863775869
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query29
Means = 13873.0, 23951.0
Time diff = -10078.0
Speedup = 0.5792242495094151
T-Test (test statistic, p value, df) = -138.53657173554876, 0.004595239422516207, 1.0
T-Test Confidence Interval = -11002.327271344653, -9153.672728655347
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query31
Means = 14874.0, 21506.0
Time diff = -6632.0
Speedup = 0.6916209429926532
T-Test (test statistic, p value, df) = -46.132373316452984, 0.0137976878869646, 1.0
T-Test Confidence Interval = -8458.646750514436, -4805.353249485565
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query33
Means = 3552.0, 5333.0
Time diff = -1781.0
Speedup = 0.6660416276017251
T-Test (test statistic, p value, df) = -20.984914886259663, 0.030314087294278914, 1.0
T-Test Confidence Interval = -2859.3818165687626, -702.6181834312374
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query35
Means = 9942.0, 13643.5
Time diff = -3701.5
Speedup = 0.7286986477076997
T-Test (test statistic, p value, df) = -83.80635378060391, 0.007595958211485061, 1.0
T-Test Confidence Interval = -4262.698700459254, -3140.301299540746
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query36
Means = 20697.0, 37183.0
Time diff = -16486.0
Speedup = 0.5566253395368851
T-Test (test statistic, p value, df) = -85.74951835910063, 0.0074238424536921714, 1.0
T-Test Confidence Interval = -18928.86493141087, -14043.13506858913
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query38
Means = 27160.0, 43873.0
Time diff = -16713.0
Speedup = 0.6190595582704624
T-Test (test statistic, p value, df) = -16.162906279675404, 0.039337561907621145, 1.0
T-Test Confidence Interval = -29851.651928399006, -3574.3480716009963
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query40
Means = 4458.0, 7929.0
Time diff = -3471.0
Speedup = 0.5622398789254635
T-Test (test statistic, p value, df) = -250.4978480446489, 0.0025414046290069396, 1.0
T-Test Confidence Interval = -3647.0623373989815, -3294.9376626010185
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query42
Means = 2535.0, 3865.0
Time diff = -1330.0
Speedup = 0.6558861578266494
T-Test (test statistic, p value, df) = -127.97930967036704, 0.0049742948155542515, 1.0
T-Test Confidence Interval = -1462.0467530492363, -1197.9532469507637
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query43
Means = 15179.0, 25691.0
Time diff = -10512.0
Speedup = 0.5908294733564283
T-Test (test statistic, p value, df) = -13.079969891640832, 0.04857685119061344, 1.0
T-Test Confidence Interval = -20723.615569140937, -300.3844308590651
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query45
Means = 3242.0, 5554.0
Time diff = -2312.0
Speedup = 0.583723442563918
T-Test (test statistic, p value, df) = -34.226508265805506, 0.01859490638348658, 1.0
T-Test Confidence Interval = -3170.3038948200356, -1453.6961051799644
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query49
Means = 5589.0, 8519.0
Time diff = -2930.0
Speedup = 0.6560629181828853
T-Test (test statistic, p value, df) = -187.95958763617816, 0.003386971496688935, 1.0
T-Test Confidence Interval = -3128.0701295738545, -2731.9298704261455
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query50
Means = 40862.0, 54916.5
Time diff = -14054.5
Speedup = 0.7440750958273015
T-Test (test statistic, p value, df) = -25.719078790255452, 0.02474035940893158, 1.0
T-Test Confidence Interval = -20997.958431172337, -7111.041568827662
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query53
Means = 18280.0, 29691.5
Time diff = -11411.5
Speedup = 0.615664415741879
T-Test (test statistic, p value, df) = -16.207706265331893, 0.0392291037318845, 1.0
T-Test Confidence Interval = -20357.667519085753, -2465.332480914247
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query59
Means = 54282.0, 82104.0
Time diff = -27822.0
Speedup = 0.6611370944168372
T-Test (test statistic, p value, df) = -180.48358639768276, 0.00352726400587758, 1.0
T-Test Confidence Interval = -29780.69350356367, -25863.30649643633
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query61
Means = 5088.0, 8353.0
Time diff = -3265.0
Speedup = 0.6091224709685144
T-Test (test statistic, p value, df) = -117.81553930650801, 0.005403399998578176, 1.0
T-Test Confidence Interval = -3617.124674797963, -2912.875325202037
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query62
Means = 65220.0, 135827.5
Time diff = -70607.5
Speedup = 0.48016785996944655
T-Test (test statistic, p value, df) = -26.600495355175532, 0.023921363714386055, 1.0
T-Test Confidence Interval = -104334.44150799242, -36880.558492007585
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query64
Means = 115962.0, 262443.5
Time diff = -146481.5
Speedup = 0.4418551040509672
T-Test (test statistic, p value, df) = -25.121382283172483, 0.025328376725672595, 1.0
T-Test Confidence Interval = -220570.73235670896, -72392.26764329104
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query65
Means = 35580.0, 82682.0
Time diff = -47102.0
Speedup = 0.4303234077550132
T-Test (test statistic, p value, df) = -51.60218667812097, 0.012335525640321495, 1.0
T-Test Confidence Interval = -58700.10647615792, -35503.89352384208
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query66
Means = 41169.0, 67463.0
Time diff = -26294.0
Speedup = 0.6102456161155003
T-Test (test statistic, p value, df) = -15.459111993963361, 0.04112358088199367, 1.0
T-Test Confidence Interval = -47905.65191572499, -4682.348084275007
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query69
Means = 5726.0, 7168.0
Time diff = -1442.0
Speedup = 0.798828125
T-Test (test statistic, p value, df) = -277.5130293904801, 0.0022940076663786795, 1.0
T-Test Confidence Interval = -1508.023376524618, -1375.976623475382
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query71
Means = 7515.0, 10410.0
Time diff = -2895.0
Speedup = 0.7219020172910663
T-Test (test statistic, p value, df) = -417.85725732599167, 0.0015235311720830304, 1.0
T-Test Confidence Interval = -2983.031168699491, -2806.968831300509
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query72
Means = 35595.0, 64122.0
Time diff = -28527.0
Speedup = 0.5551136895293347
T-Test (test statistic, p value, df) = -43.803380662692696, 0.01453105217548824, 1.0
T-Test Confidence Interval = -36801.92985775214, -20252.070142247863
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query74
Means = 51937.0, 87044.5
Time diff = -35107.5
Speedup = 0.596671817288858
T-Test (test statistic, p value, df) = -16.580224601697168, 0.03834987389173396, 1.0
T-Test Confidence Interval = -62012.02593378188, -8202.974066218121
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query75
Means = 69209.0, 130502.5
Time diff = -61293.5
Speedup = 0.5303270052297848
T-Test (test statistic, p value, df) = -27.85345826412777, 0.022846227941127655, 1.0
T-Test Confidence Interval = -89254.39995817577, -33332.60004182423
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query78
Means = 121943.0, 238542.0
Time diff = -116599.0
Speedup = 0.5112013817273269
T-Test (test statistic, p value, df) = -23.645403595799497, 0.026907581835764048, 1.0
T-Test Confidence Interval = -179255.1843218626, -53942.815678137405
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query79
Means = 18694.0, 31362.5
Time diff = -12668.5
Speedup = 0.5960621761658031
T-Test (test statistic, p value, df) = -25.708829122069506, 0.02475021303093756, 1.0
T-Test Confidence Interval = -18929.716873751284, -6407.283126248716
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query80
Means = 16541.0, 33240.5
Time diff = -16699.5
Speedup = 0.4976158601705751
T-Test (test statistic, p value, df) = -31.559609886520967, 0.020165231636396767, 1.0
T-Test Confidence Interval = -23422.880509423612, -9976.119490576388
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query82
Means = 12153.0, 24937.5
Time diff = -12784.5
Speedup = 0.4873383458646617
T-Test (test statistic, p value, df) = -49.04408316581243, 0.012978763787733116, 1.0
T-Test Confidence Interval = -16096.672722318342, -9472.327277681658
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query86
Means = 7621.0, 11501.0
Time diff = -3880.0
Speedup = 0.6626380314755239
T-Test (test statistic, p value, df) = -21.133198532601398, 0.030101702837526164, 1.0
T-Test Confidence Interval = -6212.825970536507, -1547.1740294634933
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query87
Means = 27979.0, 46990.5
Time diff = -19011.5
Speedup = 0.5954182228322746
T-Test (test statistic, p value, df) = -21.128574865637287, 0.030108280319781196, 1.0
T-Test Confidence Interval = -30444.548034846368, -7578.45196515363
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query88
Means = 252115.0, 367616.5
Time diff = -115501.5
Speedup = 0.685809804510951
T-Test (test statistic, p value, df) = -26.818750097247364, 0.023726869273490603, 1.0
T-Test Confidence Interval = -170223.87524282097, -60779.12475717902
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query90
Means = 15768.0, 33103.5
Time diff = -17335.5
Speedup = 0.4763242557433504
T-Test (test statistic, p value, df) = -29.832058395042495, 0.021332134620458392, 1.0
T-Test Confidence Interval = -24719.11427466979, -9951.885725330209
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query91
Means = 1618.0, 2119.0
Time diff = -501.0
Speedup = 0.7635677206229353
T-Test (test statistic, p value, df) = -28.92524848640025, 0.022000375278840318, 1.0
T-Test Confidence Interval = -721.077921748727, -280.9220782512729
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query93
Means = 50638.0, 102276.5
Time diff = -51638.5
Speedup = 0.49510884709586267
T-Test (test statistic, p value, df) = -46.04401834061543, 0.013824156288149609, 1.0
T-Test Confidence Interval = -65888.54543323007, -37388.45456676993
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query95
Means = 209807.0, 354026.0
Time diff = -144219.0
Speedup = 0.5926316146271743
T-Test (test statistic, p value, df) = -24.685703668028058, 0.025774913774320417, 1.0
T-Test Confidence Interval = -218451.28300584562, -69986.71699415437
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query97
Means = 23448.0, 29446.0
Time diff = -5998.0
Speedup = 0.796305100862596
T-Test (test statistic, p value, df) = -15.59885997567286, 0.04075617088568512, 1.0
T-Test Confidence Interval = -10883.72986282174, -1112.2701371782596
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query98
Means = 3617.0, 4688.5
Time diff = -1071.5
Speedup = 0.7714620880878745
T-Test (test statistic, p value, df) = -16.94878940922422, 0.037517876548842255, 1.0
T-Test Confidence Interval = -1874.7844143828538, -268.2155856171463
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query99
Means = 122054.0, 239049.5
Time diff = -116995.5
Speedup = 0.5105804446359437
T-Test (test statistic, p value, df) = -57.3164051073185, 0.011105985933025706, 1.0
T-Test Confidence Interval = -142931.6830780875, -91059.31692191251
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = benchmark
Means = 3802000.0, 7187000.0
Time diff = -3385000.0
Speedup = 0.5290107137887853
T-Test (test statistic, p value, df) = -27.525783960660327, 0.02311795793190776, 1.0
T-Test Confidence Interval = -4947553.244415962, -1822446.7555840379
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
```
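To rank queries by how hard they were hit, the report can be parsed with a few regular expressions. This is a minimal sketch that assumes the record layout shown in the listing above (`Name = ...`, `Speedup = ...`, records separated by dashed lines). `parse_results` is a hypothetical helper, and `SAMPLE` holds three records copied from the listing.

```python
import re

RECORD_SEP = re.compile(r"^-{10,}\s*$", re.MULTILINE)


def parse_results(text):
    """Return {query name: speedup} for every record that has both fields."""
    speedups = {}
    for record in RECORD_SEP.split(text):
        name = re.search(r"^Name = (\S+)", record, re.MULTILINE)
        speedup = re.search(r"^Speedup = ([0-9.eE+-]+)", record, re.MULTILINE)
        if name and speedup:
            speedups[name.group(1)] = float(speedup.group(1))
    return speedups


SAMPLE = """\
Name = query16
Means = 52068.0, 148311.0
Time diff = -96243.0
Speedup = 0.3510730829136072
--------------------------------------------------------------------
Name = query69
Means = 5726.0, 7168.0
Time diff = -1442.0
Speedup = 0.798828125
--------------------------------------------------------------------
Name = benchmark
Means = 3802000.0, 7187000.0
Time diff = -3385000.0
Speedup = 0.5290107137887853
"""

speedups = parse_results(SAMPLE)
worst = min(speedups, key=speedups.get)
print(worst, speedups[worst])
```

In the full listing, query16 (about 0.35) and query69 (about 0.80) are the lowest and highest speedups, so every query regressed with compression enabled in this spill-all setup.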
Configuration used
No compression:
```
export SPARK_CONF=("--master" "spark://master-node:7077"
"--conf" "spark.shuffle.spill.compress=false"
"--conf" "spark.rapids.memory.host.spillStorageSize=1"
"--conf" "spark.locality.wait=0"
"--conf" "spark.plugins=com.nvidia.spark.SQLPlugin"
"--conf" "spark.sql.adaptive.enabled=true"
"--conf" "spark.sql.files.maxPartitionBytes=2gb"
"--conf" "spark.driver.maxResultSize=2GB"
"--conf" "spark.driver.memory=50G"
"--conf" "spark.executor.cores=16"
"--conf" "spark.executor.memory=16G"
"--conf" "spark.executor.resource.gpu.amount=1"
"--conf" "spark.task.resource.gpu.amount=0.0625"
"--conf" "spark.rapids.memory.pinnedPool.size=8g"
"--conf" "spark.rapids.sql.concurrentGpuTasks=4"
"--conf" "spark.executor.extraJavaOptions=-Dai.rapids.cudf.nvtx.enabled=true"
"--conf" "spark.driver.extraClassPath=$SPARK_RAPIDS_PLUGIN_JAR"
"--conf" "spark.executor.extraClassPath=$SPARK_RAPIDS_PLUGIN_JAR"
"--conf" "spark.shuffle.manager=com.nvidia.spark.rapids.spark321.RapidsShuffleManager"
"--conf" "spark.rapids.shuffle.multiThreaded.writer.threads=32"
"--conf" "spark.rapids.shuffle.multiThreaded.reader.threads=32"
"--conf" "spark.rapids.shuffle.mode=MULTITHREADED")
```
Compression:
```
export SPARK_CONF=("--master" "spark://master-node:7077"
"--conf" "spark.shuffle.spill.compress=true"
"--conf" "spark.rapids.memory.host.spillStorageSize=1"
"--conf" "spark.locality.wait=0"
"--conf" "spark.plugins=com.nvidia.spark.SQLPlugin"
"--conf" "spark.sql.adaptive.enabled=true"
"--conf" "spark.sql.files.maxPartitionBytes=2gb"
"--conf" "spark.driver.maxResultSize=2GB"
"--conf" "spark.driver.memory=50G"
"--conf" "spark.executor.cores=16"
"--conf" "spark.executor.memory=16G"
"--conf" "spark.executor.resource.gpu.amount=1"
"--conf" "spark.task.resource.gpu.amount=0.0625"
"--conf" "spark.rapids.memory.pinnedPool.size=8g"
"--conf" "spark.rapids.sql.concurrentGpuTasks=4"
"--conf" "spark.executor.extraJavaOptions=-Dai.rapids.cudf.nvtx.enabled=true"
"--conf" "spark.driver.extraClassPath=$SPARK_RAPIDS_PLUGIN_JAR"
"--conf" "spark.executor.extraClassPath=$SPARK_RAPIDS_PLUGIN_JAR"
"--conf" "spark.shuffle.manager=com.nvidia.spark.rapids.spark321.RapidsShuffleManager"
"--conf" "spark.rapids.shuffle.multiThreaded.writer.threads=32"
"--conf" "spark.rapids.shuffle.multiThreaded.reader.threads=32"
"--conf" "spark.rapids.shuffle.mode=MULTITHREADED")
```
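The two configurations differ only in `spark.shuffle.spill.compress`. One way to keep it that way across reruns is to generate both argument lists from a shared base. The sketch below is a hypothetical helper, and `BASE_CONF` is deliberately abbreviated to a few of the settings listed above.

```python
# Abbreviated: only a few of the settings from the configurations above.
BASE_CONF = {
    "spark.rapids.memory.host.spillStorageSize": "1",
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.shuffle.mode": "MULTITHREADED",
}


def spark_conf_args(compress_spill):
    """Build spark-submit style arguments, toggling only spill compression."""
    conf = dict(BASE_CONF)
    conf["spark.shuffle.spill.compress"] = "true" if compress_spill else "false"
    args = ["--master", "spark://master-node:7077"]
    for key, value in sorted(conf.items()):
        args += ["--conf", f"{key}={value}"]
    return args


with_compression = spark_conf_args(True)
without_compression = spark_conf_args(False)
differing = [(a, b) for a, b in zip(with_compression, without_compression) if a != b]
print(differing)
```

Checking that `differing` contains exactly one pair is a cheap guard against a second setting drifting between the two runs.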
### Tasks
- [ ] Run standard on Dataproc GPU cluster with slow disks
- [ ] Run with forced spill (reasonable use case) on Dataproc GPU cluster with slow disks
I tested the worst case for spill by changing our plugin so that it would spill every buffer: https://github.com/abellina/spark-rapids/commit/84e18f156a3478139164f5630ede4327f9655732 with this patch https://github.com/NVIDIA/spark-rapids/pull/9454. I then made sure that every buffer was spilled from device to disk, forcing compression and decompression to happen. Note that we don't run with `unspill` on by default, so every time a buffer is read we read it compressed from disk and need to decompress it.
We may observe different results in slower IO or restricted host memory scenarios (a cloud VM would be good to try), where performance may be up to par with compression or even beat the uncompressed case. We should find examples where compression can be a benefit in performance and use this data to modify the auto tuner for thresholds it could use, likely relative to disk bandwidth, overall capacity, and amount of spill detected in application history logs.
We should also further study the results in our performance cluster and see if we can speed up spill. For example, right now spill is done completely serially, one buffer at a time, but we know that LZ4's bandwidth for a single thread is a limiting factor (re: multi-threaded shuffle). We could employ similar mechanisms at spill time to speed up compression (and also encryption) when we have several blocks to spill. A related issue is: https://github.com/NVIDIA/spark-rapids/issues/7666, because if you look at the stack traces that's really holding everything up (all tasks end up waiting for the catalog lock, because one task is spilling). So there are steps we can take, and we can re-run the spill-all case to figure out if we improve our worst case.
In terms of the amount spilled, with compression we spilled 12TB, without compression we spilled 25TB.
Full results
```
Name = query1
Means = 6415.0, 12243.0
Time diff = -5828.0
Speedup = 0.5239728824634485
T-Test (test statistic, p value, df) = -46.733296789404704, 0.013620323808451023, 1.0
T-Test Confidence Interval = -7412.561036590835, -4243.438963409165
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query3
Means = 8656.0, 12730.0
Time diff = -4074.0
Speedup = 0.6799685781618224
T-Test (test statistic, p value, df) = -196.01041638987795, 0.0032478592763304165, 1.0
T-Test Confidence Interval = -4338.093506098472, -3809.9064939015275
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query4
Means = 153913.0, 266906.5
Time diff = -112993.5
Speedup = 0.5766551208007299
T-Test (test statistic, p value, df) = -39.287460187701285, 0.01620064873435169, 1.0
T-Test Confidence Interval = -149537.43890637613, -76449.56109362387
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query6
Means = 2076.0, 3271.5
Time diff = -1195.5
Speedup = 0.634571297569922
T-Test (test statistic, p value, df) = -14.530994669814685, 0.04374219577439338, 1.0
T-Test Confidence Interval = -2240.870128306454, -150.12987169354642
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query7
Means = 34275.0, 75803.0
Time diff = -41528.0
Speedup = 0.4521588855322349
T-Test (test statistic, p value, df) = -41.196223331454945, 0.015450318657754002, 1.0
T-Test Confidence Interval = -54336.53504577591, -28719.464954224088
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query9
Means = 53082.0, 136078.5
Time diff = -82996.5
Speedup = 0.39008366494339664
T-Test (test statistic, p value, df) = -12.810600619381573, 0.04959419426750873, 1.0
T-Test Confidence Interval = -165316.64663011135, -676.3533698886458
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query11
Means = 78715.0, 144497.0
Time diff = -65782.0
Speedup = 0.5447517941548959
T-Test (test statistic, p value, df) = -19.47654123478562, 0.032657812876891255, 1.0
T-Test Confidence Interval = -108697.19474100177, -22866.805258998225
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query12
Means = 964.0, 1517.0
Time diff = -553.0
Speedup = 0.6354647330257086
T-Test (test statistic, p value, df) = -31.9274698861863, 0.019933045789918242, 1.0
T-Test Confidence Interval = -773.077921748727, -332.9220782512729
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query14_part1
Means = 115181.0, 221352.0
Time diff = -106171.0
Speedup = 0.5203521992121146
T-Test (test statistic, p value, df) = -18.30879791819945, 0.034736734446920055, 1.0
T-Test Confidence Interval = -179853.08820147382, -32488.911798526184
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query14_part2
Means = 109677.0, 206264.0
Time diff = -96587.0
Speedup = 0.5317311794593337
T-Test (test statistic, p value, df) = -21.01942346408533, 0.030264394239590434, 1.0
T-Test Confidence Interval = -154973.6726399373, -38200.327360062714
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query16
Means = 52068.0, 148311.0
Time diff = -96243.0
Speedup = 0.3510730829136072
T-Test (test statistic, p value, df) = -22.173153215330068, 0.02869184503407946, 1.0
T-Test Confidence Interval = -151394.527190231, -41091.472809769
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query17
Means = 7584.0, 13227.0
Time diff = -5643.0
Speedup = 0.5733726468586982
T-Test (test statistic, p value, df) = -814.4968922592645, 0.0007816106587309643, 1.0
T-Test Confidence Interval = -5731.031168699491, -5554.968831300509
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query22
Means = 8517.0, 16893.0
Time diff = -8376.0
Speedup = 0.5041733262297994
T-Test (test statistic, p value, df) = -19.038920687922463, 0.033407109622622284, 1.0
T-Test Confidence Interval = -13965.979212417667, -2786.0207875823335
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query23_part1
Means = 209113.0, 519179.5
Time diff = -310066.5
Speedup = 0.4027759185406974
T-Test (test statistic, p value, df) = -206.35962794430557, 0.0030849774036093864, 1.0
T-Test Confidence Interval = -329158.25971170206, -290974.74028829794
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query24_part1
Means = 113715.0, 196010.5
Time diff = -82295.5
Speedup = 0.5801474920986376
T-Test (test statistic, p value, df) = -20.608687520318735, 0.030866635017350957, 1.0
T-Test Confidence Interval = -133034.46485916903, -31556.53514083098
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query24_part2
Means = 117606.0, 198203.5
Time diff = -80597.5
Speedup = 0.5933598548966088
T-Test (test statistic, p value, df) = -24.44601435303959, 0.02602735258808861, 1.0
T-Test Confidence Interval = -122489.33240487019, -38705.6675951298
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query25
Means = 6139.0, 9359.5
Time diff = -3220.5
Speedup = 0.6559111063625194
T-Test (test statistic, p value, df) = -19.07032350692502, 0.03335219931563161, 1.0
T-Test Confidence Interval = -5366.259737050089, -1074.740262949911
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query26
Means = 16413.0, 40341.0
Time diff = -23928.0
Speedup = 0.4068565479289061
T-Test (test statistic, p value, df) = -23.179257116055982, 0.02744804303055482, 1.0
T-Test Confidence Interval = -37044.64413622413, -10811.355863775869
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query29
Means = 13873.0, 23951.0
Time diff = -10078.0
Speedup = 0.5792242495094151
T-Test (test statistic, p value, df) = -138.53657173554876, 0.004595239422516207, 1.0
T-Test Confidence Interval = -11002.327271344653, -9153.672728655347
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query31
Means = 14874.0, 21506.0
Time diff = -6632.0
Speedup = 0.6916209429926532
T-Test (test statistic, p value, df) = -46.132373316452984, 0.0137976878869646, 1.0
T-Test Confidence Interval = -8458.646750514436, -4805.353249485565
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query33
Means = 3552.0, 5333.0
Time diff = -1781.0
Speedup = 0.6660416276017251
T-Test (test statistic, p value, df) = -20.984914886259663, 0.030314087294278914, 1.0
T-Test Confidence Interval = -2859.3818165687626, -702.6181834312374
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query35
Means = 9942.0, 13643.5
Time diff = -3701.5
Speedup = 0.7286986477076997
T-Test (test statistic, p value, df) = -83.80635378060391, 0.007595958211485061, 1.0
T-Test Confidence Interval = -4262.698700459254, -3140.301299540746
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query36
Means = 20697.0, 37183.0
Time diff = -16486.0
Speedup = 0.5566253395368851
T-Test (test statistic, p value, df) = -85.74951835910063, 0.0074238424536921714, 1.0
T-Test Confidence Interval = -18928.86493141087, -14043.13506858913
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query38
Means = 27160.0, 43873.0
Time diff = -16713.0
Speedup = 0.6190595582704624
T-Test (test statistic, p value, df) = -16.162906279675404, 0.039337561907621145, 1.0
T-Test Confidence Interval = -29851.651928399006, -3574.3480716009963
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query40
Means = 4458.0, 7929.0
Time diff = -3471.0
Speedup = 0.5622398789254635
T-Test (test statistic, p value, df) = -250.4978480446489, 0.0025414046290069396, 1.0
T-Test Confidence Interval = -3647.0623373989815, -3294.9376626010185
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query42
Means = 2535.0, 3865.0
Time diff = -1330.0
Speedup = 0.6558861578266494
T-Test (test statistic, p value, df) = -127.97930967036704, 0.0049742948155542515, 1.0
T-Test Confidence Interval = -1462.0467530492363, -1197.9532469507637
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query43
Means = 15179.0, 25691.0
Time diff = -10512.0
Speedup = 0.5908294733564283
T-Test (test statistic, p value, df) = -13.079969891640832, 0.04857685119061344, 1.0
T-Test Confidence Interval = -20723.615569140937, -300.3844308590651
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query45
Means = 3242.0, 5554.0
Time diff = -2312.0
Speedup = 0.583723442563918
T-Test (test statistic, p value, df) = -34.226508265805506, 0.01859490638348658, 1.0
T-Test Confidence Interval = -3170.3038948200356, -1453.6961051799644
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query49
Means = 5589.0, 8519.0
Time diff = -2930.0
Speedup = 0.6560629181828853
T-Test (test statistic, p value, df) = -187.95958763617816, 0.003386971496688935, 1.0
T-Test Confidence Interval = -3128.0701295738545, -2731.9298704261455
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query50
Means = 40862.0, 54916.5
Time diff = -14054.5
Speedup = 0.7440750958273015
T-Test (test statistic, p value, df) = -25.719078790255452, 0.02474035940893158, 1.0
T-Test Confidence Interval = -20997.958431172337, -7111.041568827662
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query53
Means = 18280.0, 29691.5
Time diff = -11411.5
Speedup = 0.615664415741879
T-Test (test statistic, p value, df) = -16.207706265331893, 0.0392291037318845, 1.0
T-Test Confidence Interval = -20357.667519085753, -2465.332480914247
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query59
Means = 54282.0, 82104.0
Time diff = -27822.0
Speedup = 0.6611370944168372
T-Test (test statistic, p value, df) = -180.48358639768276, 0.00352726400587758, 1.0
T-Test Confidence Interval = -29780.69350356367, -25863.30649643633
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query61
Means = 5088.0, 8353.0
Time diff = -3265.0
Speedup = 0.6091224709685144
T-Test (test statistic, p value, df) = -117.81553930650801, 0.005403399998578176, 1.0
T-Test Confidence Interval = -3617.124674797963, -2912.875325202037
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query62
Means = 65220.0, 135827.5
Time diff = -70607.5
Speedup = 0.48016785996944655
T-Test (test statistic, p value, df) = -26.600495355175532, 0.023921363714386055, 1.0
T-Test Confidence Interval = -104334.44150799242, -36880.558492007585
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query64
Means = 115962.0, 262443.5
Time diff = -146481.5
Speedup = 0.4418551040509672
T-Test (test statistic, p value, df) = -25.121382283172483, 0.025328376725672595, 1.0
T-Test Confidence Interval = -220570.73235670896, -72392.26764329104
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query65
Means = 35580.0, 82682.0
Time diff = -47102.0
Speedup = 0.4303234077550132
T-Test (test statistic, p value, df) = -51.60218667812097, 0.012335525640321495, 1.0
T-Test Confidence Interval = -58700.10647615792, -35503.89352384208
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query66
Means = 41169.0, 67463.0
Time diff = -26294.0
Speedup = 0.6102456161155003
T-Test (test statistic, p value, df) = -15.459111993963361, 0.04112358088199367, 1.0
T-Test Confidence Interval = -47905.65191572499, -4682.348084275007
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query69
Means = 5726.0, 7168.0
Time diff = -1442.0
Speedup = 0.798828125
T-Test (test statistic, p value, df) = -277.5130293904801, 0.0022940076663786795, 1.0
T-Test Confidence Interval = -1508.023376524618, -1375.976623475382
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query71
Means = 7515.0, 10410.0
Time diff = -2895.0
Speedup = 0.7219020172910663
T-Test (test statistic, p value, df) = -417.85725732599167, 0.0015235311720830304, 1.0
T-Test Confidence Interval = -2983.031168699491, -2806.968831300509
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query72
Means = 35595.0, 64122.0
Time diff = -28527.0
Speedup = 0.5551136895293347
T-Test (test statistic, p value, df) = -43.803380662692696, 0.01453105217548824, 1.0
T-Test Confidence Interval = -36801.92985775214, -20252.070142247863
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query74
Means = 51937.0, 87044.5
Time diff = -35107.5
Speedup = 0.596671817288858
T-Test (test statistic, p value, df) = -16.580224601697168, 0.03834987389173396, 1.0
T-Test Confidence Interval = -62012.02593378188, -8202.974066218121
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query75
Means = 69209.0, 130502.5
Time diff = -61293.5
Speedup = 0.5303270052297848
T-Test (test statistic, p value, df) = -27.85345826412777, 0.022846227941127655, 1.0
T-Test Confidence Interval = -89254.39995817577, -33332.60004182423
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query78
Means = 121943.0, 238542.0
Time diff = -116599.0
Speedup = 0.5112013817273269
T-Test (test statistic, p value, df) = -23.645403595799497, 0.026907581835764048, 1.0
T-Test Confidence Interval = -179255.1843218626, -53942.815678137405
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query79
Means = 18694.0, 31362.5
Time diff = -12668.5
Speedup = 0.5960621761658031
T-Test (test statistic, p value, df) = -25.708829122069506, 0.02475021303093756, 1.0
T-Test Confidence Interval = -18929.716873751284, -6407.283126248716
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query80
Means = 16541.0, 33240.5
Time diff = -16699.5
Speedup = 0.4976158601705751
T-Test (test statistic, p value, df) = -31.559609886520967, 0.020165231636396767, 1.0
T-Test Confidence Interval = -23422.880509423612, -9976.119490576388
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query82
Means = 12153.0, 24937.5
Time diff = -12784.5
Speedup = 0.4873383458646617
T-Test (test statistic, p value, df) = -49.04408316581243, 0.012978763787733116, 1.0
T-Test Confidence Interval = -16096.672722318342, -9472.327277681658
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query86
Means = 7621.0, 11501.0
Time diff = -3880.0
Speedup = 0.6626380314755239
T-Test (test statistic, p value, df) = -21.133198532601398, 0.030101702837526164, 1.0
T-Test Confidence Interval = -6212.825970536507, -1547.1740294634933
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query87
Means = 27979.0, 46990.5
Time diff = -19011.5
Speedup = 0.5954182228322746
T-Test (test statistic, p value, df) = -21.128574865637287, 0.030108280319781196, 1.0
T-Test Confidence Interval = -30444.548034846368, -7578.45196515363
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query88
Means = 252115.0, 367616.5
Time diff = -115501.5
Speedup = 0.685809804510951
T-Test (test statistic, p value, df) = -26.818750097247364, 0.023726869273490603, 1.0
T-Test Confidence Interval = -170223.87524282097, -60779.12475717902
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query90
Means = 15768.0, 33103.5
Time diff = -17335.5
Speedup = 0.4763242557433504
T-Test (test statistic, p value, df) = -29.832058395042495, 0.021332134620458392, 1.0
T-Test Confidence Interval = -24719.11427466979, -9951.885725330209
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query91
Means = 1618.0, 2119.0
Time diff = -501.0
Speedup = 0.7635677206229353
T-Test (test statistic, p value, df) = -28.92524848640025, 0.022000375278840318, 1.0
T-Test Confidence Interval = -721.077921748727, -280.9220782512729
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query93
Means = 50638.0, 102276.5
Time diff = -51638.5
Speedup = 0.49510884709586267
T-Test (test statistic, p value, df) = -46.04401834061543, 0.013824156288149609, 1.0
T-Test Confidence Interval = -65888.54543323007, -37388.45456676993
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query95
Means = 209807.0, 354026.0
Time diff = -144219.0
Speedup = 0.5926316146271743
T-Test (test statistic, p value, df) = -24.685703668028058, 0.025774913774320417, 1.0
T-Test Confidence Interval = -218451.28300584562, -69986.71699415437
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query97
Means = 23448.0, 29446.0
Time diff = -5998.0
Speedup = 0.796305100862596
T-Test (test statistic, p value, df) = -15.59885997567286, 0.04075617088568512, 1.0
T-Test Confidence Interval = -10883.72986282174, -1112.2701371782596
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query98
Means = 3617.0, 4688.5
Time diff = -1071.5
Speedup = 0.7714620880878745
T-Test (test statistic, p value, df) = -16.94878940922422, 0.037517876548842255, 1.0
T-Test Confidence Interval = -1874.7844143828538, -268.2155856171463
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = query99
Means = 122054.0, 239049.5
Time diff = -116995.5
Speedup = 0.5105804446359437
T-Test (test statistic, p value, df) = -57.3164051073185, 0.011105985933025706, 1.0
T-Test Confidence Interval = -142931.6830780875, -91059.31692191251
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
--------------------------------------------------------------------
Name = benchmark
Means = 3802000.0, 7187000.0
Time diff = -3385000.0
Speedup = 0.5290107137887853
T-Test (test statistic, p value, df) = -27.525783960660327, 0.02311795793190776, 1.0
T-Test Confidence Interval = -4947553.244415962, -1822446.7555840379
ALERT: significant change has been detected (p-value < 0.05)
ALERT: regression in performance has been observed
```

Configuration used
No compression:
```
export SPARK_CONF=("--master" "spark://master-node:7077"
  "--conf" "spark.shuffle.spill.compress=false"
  "--conf" "spark.rapids.memory.host.spillStorageSize=1"
  "--conf" "spark.locality.wait=0"
  "--conf" "spark.plugins=com.nvidia.spark.SQLPlugin"
  "--conf" "spark.sql.adaptive.enabled=true"
  "--conf" "spark.sql.files.maxPartitionBytes=2gb"
  "--conf" "spark.driver.maxResultSize=2GB"
  "--conf" "spark.driver.memory=50G"
  "--conf" "spark.executor.cores=16"
  "--conf" "spark.executor.memory=16G"
  "--conf" "spark.executor.resource.gpu.amount=1"
  "--conf" "spark.task.resource.gpu.amount=0.0625"
  "--conf" "spark.rapids.memory.pinnedPool.size=8g"
  "--conf" "spark.rapids.sql.concurrentGpuTasks=4"
  "--conf" "spark.executor.extraJavaOptions=-Dai.rapids.cudf.nvtx.enabled=true"
  "--conf" "spark.driver.extraClassPath=$SPARK_RAPIDS_PLUGIN_JAR"
  "--conf" "spark.executor.extraClassPath=$SPARK_RAPIDS_PLUGIN_JAR"
  "--conf" "spark.shuffle.manager=com.nvidia.spark.rapids.spark321.RapidsShuffleManager"
  "--conf" "spark.rapids.shuffle.multiThreaded.writer.threads=32"
  "--conf" "spark.rapids.shuffle.multiThreaded.reader.threads=32"
  "--conf" "spark.rapids.shuffle.mode=MULTITHREADED")
```

Compression:
```
export SPARK_CONF=("--master" "spark://master-node:7077"
  "--conf" "spark.shuffle.spill.compress=true"
  "--conf" "spark.rapids.memory.host.spillStorageSize=1"
  "--conf" "spark.locality.wait=0"
  "--conf" "spark.plugins=com.nvidia.spark.SQLPlugin"
  "--conf" "spark.sql.adaptive.enabled=true"
  "--conf" "spark.sql.files.maxPartitionBytes=2gb"
  "--conf" "spark.driver.maxResultSize=2GB"
  "--conf" "spark.driver.memory=50G"
  "--conf" "spark.executor.cores=16"
  "--conf" "spark.executor.memory=16G"
  "--conf" "spark.executor.resource.gpu.amount=1"
  "--conf" "spark.task.resource.gpu.amount=0.0625"
  "--conf" "spark.rapids.memory.pinnedPool.size=8g"
  "--conf" "spark.rapids.sql.concurrentGpuTasks=4"
  "--conf" "spark.executor.extraJavaOptions=-Dai.rapids.cudf.nvtx.enabled=true"
  "--conf" "spark.driver.extraClassPath=$SPARK_RAPIDS_PLUGIN_JAR"
  "--conf" "spark.executor.extraClassPath=$SPARK_RAPIDS_PLUGIN_JAR"
  "--conf" "spark.shuffle.manager=com.nvidia.spark.rapids.spark321.RapidsShuffleManager"
  "--conf" "spark.rapids.shuffle.multiThreaded.writer.threads=32"
  "--conf" "spark.rapids.shuffle.multiThreaded.reader.threads=32"
  "--conf" "spark.rapids.shuffle.mode=MULTITHREADED")
```