browserify / sha.js

Streamable SHA hashes in pure javascript
Other
288 stars 60 forks source link

SHA1 optimization #18

Closed dcousens closed 9 years ago

dcousens commented 9 years ago

This pull request unpacks the loops in SHA1 such that no branches exist, improving performance by about 16 to 20% >20%.

The loops haven't been entirely unrolled, as it would likely inhibit readability so severely with [likely] little gain.

run (N), input-size (bytes), ops (bytes/ms), time (ms)
1, 1, 34.47, 0.02901073397156948
3, 2, 89.8, 0.022271714922048998
5, 3, 139.65, 0.021482277121374866
7, 4, 174.88, 0.022872827081427266
8, 5, 217.9, 0.02294630564479119
9, 6, 322.38, 0.018611576400521124
10, 7, 345.38, 0.020267531414673693
11, 9, 411.3, 0.02188183807439825
12, 11, 475.53, 0.023132084200786492
13, 14, 571.76, 0.024485798237022526
14, 17, 750.04, 0.022665457842248413
15, 21, 1062.81, 0.019758940920766646
16, 25, 1193.25, 0.020951183741881416
17, 31, 1535.74, 0.020185708518368994
18, 38, 1576.24, 0.024108003857280617
19, 46, 2346, 0.0196078431372549
20, 56, 1538.32, 0.03640334910811795
21, 69, 1693.95, 0.04073319755600815
22, 84, 1973.16, 0.04257130693912303
23, 103, 2579.12, 0.039936102236421724
24, 126, 2125.62, 0.05927682276229994
25, 154, 2610.3, 0.058997050147492625
26, 188, 2498.52, 0.07524454477050414
27, 230, 3059, 0.07518796992481203
28, 282, 3262.74, 0.08643042350907519
29, 345, 3588, 0.09615384615384616
30, 422, 3422.42, 0.12330456226880394
31, 516, 3008.28, 0.17152658662092624
32, 631, 3672.42, 0.1718213058419244
33, 772, 4184.24, 0.18450184501845018
34, 944, 3624.96, 0.2604166666666667
35, 1155, 3915.45, 0.2949852507374631
36, 1413, 3716.19, 0.38022813688212925
37, 1728, 3749.76, 0.4608294930875576
38, 2113, 4395.04, 0.4807692307692308
39, 2585, 3877.5, 0.6666666666666666
40, 3162, 3857.64, 0.819672131147541
41, 3868, 3906.68, 0.9900990099009901
42, 4732, 3927.56, 1.2048192771084338
43, 5788, 5556.48, 1.0416666666666667
44, 7079, 3995.079207920792, 1.7719298245614035
45, 8660, 3983.6, 2.1739130434782608
46, 10593, 3985.4851485148515, 2.6578947368421053
47, 12957, 4016.67, 3.225806451612903
48, 15849, 4279.23, 3.7037037037037037
49, 19387, 3991.4411764705883, 4.857142857142857
50, 23714, 3952.3333333333335, 6
51, 29007, 3904.7884615384614, 7.428571428571429
52, 35481, 4133.708737864078, 8.583333333333334
53, 43401, 5477.796116504855, 7.923076923076923
54, 53088, 4247.04, 12.5
55, 64938, 4370.826923076923, 14.857142857142858
56, 79433, 4217.6814159292035, 18.833333333333332
57, 97163, 4337.633928571428, 22.4
58, 118850, 4401.851851851852, 27
59, 145378, 4076.0186915887853, 35.666666666666664
60, 177828, 4521.050847457627, 39.333333333333336
61, 217520, 4104.1509433962265, 53
62, 266073, 4927.277777777777, 54
63, 325462, 4520.305555555556, 72
64, 398107, 4767.748502994012, 83.5
65, 486968, 4234.504347826087, 115
66, 595662, 4445.238805970149, 134
67, 728618, 4670.628205128205, 156
68, 891251, 4594.077319587629, 194
69, 1090184, 5023.889400921659, 217
70, 1333521, 5148.72972972973, 259
71, 1631173, 5605.405498281787, 291
72, 1995262, 5334.92513368984, 374
73, 2440619, 5423.597777777778, 450
74, 2985383, 5497.942909760589, 543
75, 3651741, 5516.2250755287005, 662
76, 4466836, 5604.562107904642, 797
77, 5463865, 5644.488636363636, 968
78, 6683439, 5683.196428571428, 1176
79, 8175230, 5712.948986722572, 1431
80, 10000000, 5763.688760806916, 1735
feross commented 9 years ago

@dcousens The quality and quantity of your PRs to sha.js are impressive! If you're interested in taking a look at the buffer module (also used by browserify) I think you could make a real difference and I'd love to merge your PRs! Our current perf numbers are here: https://github.com/feross/buffer#performance

dominictarr commented 9 years ago

This is awesome!

Although it's still not as fast as: https://github.com/srijs/rusha , I would have used rusha in crypto-browserify except it unfortunately does not support streaming. (and only implements sha1) Athough, on recent versions I got this truly weird error:

https://github.com/srijs/rusha/issues/22

dominictarr commented 9 years ago

okay merged into 2.3.3

Also, this is now faster than @creationix's https://github.com/creationix/git-sha1

git-sha1:

> node bench-hash.js git-sha1
run (N), input-size (bytes), ops (bytes/ms), time (ms)
1, 1, 42.65, 0.023446658851113716
3, 2, 154.68, 0.012929919834497027
5, 3, 153.6, 0.01953125
7, 4, 177.2, 0.022573363431151242
8, 5, 201.75, 0.024783147459727387
9, 6, 247.44, 0.02424830261881668
10, 7, 298.62, 0.02344116268166901
11, 9, 410.13, 0.021944261575597982
12, 11, 479.93, 0.022920009168003668
13, 14, 569.1, 0.024600246002460024
14, 17, 694.79, 0.024467824810374357
15, 21, 830.34, 0.025290844714213456
16, 25, 996, 0.025100401606425703
17, 31, 1184.2, 0.02617801047120419
18, 38, 1480.86, 0.025660764690787787
19, 46, 1754.9, 0.02621231979030144
20, 56, 1092.56, 0.051255766273705795
21, 69, 1416.57, 0.04870920603994155
22, 84, 1706.88, 0.04921259842519685
23, 103, 2066.18, 0.049850448654037885
24, 126, 1854.72, 0.06793478260869565
25, 154, 2045.12, 0.07530120481927711
26, 188, 1932.64, 0.09727626459143969
27, 230, 2189.6, 0.10504201680672269
28, 282, 2016.3, 0.13986013986013987
29, 345, 4488.45, 0.07686395080707148
30, 422, 2477.14, 0.17035775127768313
31, 516, 2667.72, 0.19342359767891681
32, 631, 2751.16, 0.22935779816513763
33, 772, 3620.68, 0.21321961620469082
34, 944, 2520.48, 0.37453183520599254
35, 1155, 3210.9, 0.3597122302158273
36, 1413, 2289.06, 0.6172839506172839
37, 1728, 3576.96, 0.4830917874396135
38, 2113, 2260.91, 0.9345794392523364
39, 2585, 2300.65, 1.1235955056179776
40, 3162, 2308.26, 1.36986301369863
41, 3868, 2320.8, 1.6666666666666667
42, 4732, 2295.7227722772277, 2.061224489795918
43, 5788, 2315.2, 2.5
44, 7079, 2690.02, 2.6315789473684212
45, 8660, 2684.6, 3.225806451612903
46, 10593, 2517.1485148514853, 4.208333333333333
47, 12957, 2332.26, 5.555555555555555
48, 15849, 2221.822429906542, 7.133333333333334
49, 19387, 2215.657142857143, 8.75
50, 23714, 2347.920792079208, 10.1
51, 29007, 2320.56, 12.5
52, 35481, 2365.4, 15
53, 43401, 2411.1666666666665, 18
54, 53088, 2413.090909090909, 22
55, 64938, 2405.1111111111113, 27
56, 79433, 2336.264705882353, 34
57, 97163, 2491.358974358974, 39
58, 118850, 2330.392156862745, 51
59, 145378, 2506.5172413793102, 58
60, 177828, 2654.1492537313434, 67
61, 217520, 2843.3986928104573, 76.5
62, 266073, 3669.9724137931034, 72.5
63, 325462, 4368.61744966443, 74.5
64, 398107, 3686.175925925926, 108
65, 486968, 3745.9076923076923, 130
66, 595662, 5135.017241379311, 116
67, 728618, 4260.923976608187, 171
68, 891251, 4145.353488372093, 215
69, 1090184, 3935.682310469314, 277
70, 1333521, 4220.0031645569625, 316
71, 1631173, 4840.275964391692, 337
72, 1995262, 4586.809195402298, 435
73, 2440619, 4852.125248508946, 503
74, 2985383, 5034.3726812816185, 593
75, 3651741, 5107.33006993007, 715
76, 4466836, 5230.487119437939, 854
77, 5463865, 5474.814629258517, 998
78, 6683439, 5816.7441253263705, 1149
79, 8175230, 5781.633663366337, 1414
80, 10000000, 5701.254275940707, 1754

sha.js

> node bench-hash.js crypto-browserify
run (N), input-size (bytes), ops (bytes/ms), time (ms)
1, 1, 39.99, 0.025006251562890724
3, 2, 173.56, 0.011523392486748099
5, 3, 182.43, 0.0164446637066272
7, 4, 157.56, 0.025387154100025386
8, 5, 196.95, 0.025387154100025386
9, 6, 236.34, 0.025387154100025386
10, 7, 289.38, 0.024189646831156264
11, 9, 354.15, 0.025412960609911054
12, 11, 436.37, 0.025207965717166624
13, 14, 693, 0.020202020202020204
14, 17, 660.28, 0.025746652935118436
15, 21, 846.3, 0.02481389578163772
16, 25, 1077.25, 0.023207240659085634
17, 31, 1253.33, 0.02473410833539451
18, 38, 1467.56, 0.02589331952356292
19, 46, 1788.94, 0.025713551041398816
20, 56, 1508.08, 0.03713330857779428
21, 69, 2097.6, 0.03289473684210526
22, 84, 1996.68, 0.04206983592763988
23, 103, 3026.14, 0.03403675970047652
24, 126, 2143.26, 0.058788947677836566
25, 154, 2625.7, 0.05865102639296188
26, 188, 2492.88, 0.07541478129713423
27, 230, 3049.8, 0.07541478129713423
28, 282, 3062.52, 0.09208103130755065
29, 345, 3370.65, 0.1023541453428864
30, 422, 3346.46, 0.12610340479192939
31, 516, 2910.24, 0.1773049645390071
32, 631, 3546.22, 0.17793594306049823
33, 772, 3412.24, 0.22624434389140272
34, 944, 3577.76, 0.2638522427440633
35, 1155, 3187.8, 0.36231884057971014
36, 1413, 3377.07, 0.41841004184100417
37, 1728, 5616, 0.3076923076923077
38, 2113, 3866.79, 0.546448087431694
39, 2585, 3799.95, 0.6802721088435374
40, 3162, 3604.68, 0.8771929824561403
41, 3868, 7117.12, 0.5434782608695652
42, 4732, 3738.28, 1.2658227848101267
43, 5788, 4861.92, 1.1904761904761905
44, 7079, 3964.24, 1.7857142857142858
45, 8660, 3550.6, 2.4390243902439024
46, 10593, 3985.4851485148515, 2.6578947368421053
47, 12957, 5830.65, 2.2222222222222223
48, 15849, 7607.52, 2.0833333333333335
49, 19387, 5374.6138613861385, 3.607142857142857
50, 23714, 7278.554455445545, 3.2580645161290325
51, 29007, 6031.158415841584, 4.809523809523809
52, 35481, 4215.564356435643, 8.416666666666666
53, 43401, 4056.1682242990655, 10.7
54, 53088, 4044.8, 13.125
55, 64938, 4370.826923076923, 14.857142857142858
56, 79433, 5245.575471698113, 15.142857142857142
57, 97163, 4671.298076923077, 20.8
58, 118850, 4133.913043478261, 28.75
59, 145378, 4038.277777777778, 36
60, 177828, 4337.268292682927, 41
61, 217520, 3991.1926605504586, 54.5
62, 266073, 4223.380952380952, 63
63, 325462, 5660.208695652174, 57.5
64, 398107, 6580.280991735537, 60.5
65, 486968, 7161.294117647059, 68
66, 595662, 4108.013793103448, 145
67, 728618, 4956.585034013606, 147
68, 891251, 6062.931972789115, 147
69, 1090184, 7677.352112676056, 142
70, 1333521, 7449.8379888268155, 179
71, 1631173, 7380.873303167421, 221
72, 1995262, 6740.75, 296
73, 2440619, 8081.519867549669, 302
74, 2985383, 7596.394402035623, 393
75, 3651741, 7921.347071583514, 461
76, 4466836, 9023.91111111111, 495
77, 5463865, 9619.480633802817, 568
78, 6683439, 8725.116187989555, 766
79, 8175230, 8895.788900979325, 919
80, 10000000, 9082.652134423251, 1101
dominictarr commented 9 years ago

oops, I got confused because there where two commits with the same message: 26a75eca5f841850a05384d8a3a95e22ee8d9617 and bf46619c437f14c8aec95049fc054a13ee42c779

merged into 2.3.4

dcousens commented 9 years ago
* b044c10 (HEAD, parent/master) 2.3.4
*   a454744 Merge branch 'shaopt' of https://github.com/dcousens/sha.js
|\
| * 26a75ec (origin/shaopt, shaopt) sha1: use a closure over separate loops
* | d7c1638 2.3.3
* | bf46619 sha1: use a closure over seperate loops
|/
* f830142 sha1: unroll conditionals

Not the prettiest merge, and it clearly shows me fixing the spelling mistake in bf46619 haha.

Cheers though :) Maybe I'll look into rusha at some point. How much faster is it?

creationix commented 9 years ago

Congrats on beating mine. I didn't spend any time optimizing mine since it was never a bottleneck for me.

dominictarr commented 9 years ago

@creationix nevertheless, you did write a very fast sha1!