Closed zzdeys closed 1 month ago
Thanks for the patch but we use defer as it has better performacne in early Go days. I've did some benchmarks some time ago and still was a bit slower. Could you bench your changes?
I conducted two benchmark tests each on macOS and Linux, with the following operating system and Go version: macos: go version go1.22.1 darwin/arm64. linux:go version go1.22.5 linux/amd64 After summarizing and organizing the data, I concluded that performing the unlock operation within the defer statement is indeed slightly slower, but the performance difference is only 4%, which I consider acceptable.
The summarized results are as follows:
| | old time/op | new time/op | delta |
|-----------------------------|------------|------------|---------------|
| Write | 16106.2 | 16864.3 | 0.047068831 |
| Append | 40551.8 | 43016.4 | 0.060776587 |
| Read | 16773.6 | 16999.9 | 0.013491439 |
| Iterate | 3150.4 | 3168.7 | 0.005808786 |
| ReadFromCacheNonExistentKeys| 4737.561 | 4850.682 | 0.023877476 |
| All | 81319.561 | 84899.982 | 0.044029025 |
The complete benchmark data is as follows:
go version go1.22.1 darwin/arm64
name old time/op new time/op delta
BenchmarkWriteToCacheWith1Shard-10 713.1 ns/op 568.2 ns/op -20.31%
BenchmarkWriteToLimitedCacheWithSmallInitSizeAnd1Shard-10 277.3 ns/op 272.7 ns/op -1.66%
BenchmarkWriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-10 759.4 ns/op 714.5 ns/op -5.92%
BenchmarkWriteToCache/1-shards-10 517.9 ns/op 539.3 ns/op +4.13%
BenchmarkWriteToCache/512-shards-10 169.1 ns/op 167.3 ns/op -1.07%
BenchmarkWriteToCache/1024-shards-10 134.3 ns/op 120.6 ns/op -10.21%
BenchmarkWriteToCache/8192-shards-10 107.8 ns/op 100.3 ns/op -6.95%
BenchmarkAppendToCache/1-shards-10 5593 ns/op 5354 ns/op -4.27%
BenchmarkAppendToCache/512-shards-10 2274 ns/op 2275 ns/op +0.04%
BenchmarkAppendToCache/1024-shards-10 2625 ns/op 1501 ns/op -42.83%
BenchmarkAppendToCache/8192-shards-10 2253 ns/op 3261 ns/op +44.76%
BenchmarkReadFromCache/1-shards-10 488.8 ns/op 493.3 ns/op +0.92%
BenchmarkReadFromCache/512-shards-10 468.9 ns/op 485.0 ns/op +3.44%
BenchmarkReadFromCache/1024-shards-10 470.5 ns/op 471.8 ns/op +0.28%
BenchmarkReadFromCache/8192-shards-10 489.5 ns/op 567.2 ns/op +15.88%
BenchmarkReadFromCacheWithInfo/1-shards-10 483.2 ns/op 492.4 ns/op +1.90%
BenchmarkReadFromCacheWithInfo/512-shards-10 479.4 ns/op 477.4 ns/op -0.42%
BenchmarkReadFromCacheWithInfo/1024-shards-10 484.7 ns/op 483.8 ns/op -0.19%
BenchmarkReadFromCacheWithInfo/8192-shards-10 497.6 ns/op 488.7 ns/op -1.79%
BenchmarkIterateOverCache/512-shards-10 290.7 ns/op 294.5 ns/op +1.31%
BenchmarkIterateOverCache/1024-shards-10 285.7 ns/op 289.5 ns/op +1.33%
BenchmarkIterateOverCache/8192-shards-10 287.1 ns/op 283.6 ns/op -1.22%
BenchmarkWriteToCacheWith1024ShardsAndSmallShardInitSize-10 263.9 ns/op 220.1 ns/op -16.60%
BenchmarkReadFromCacheNonExistentKeys/1-shards-10 289.2 ns/op 301.9 ns/op +4.40%
BenchmarkReadFromCacheNonExistentKeys/512-shards-10 290.9 ns/op 301.0 ns/op +3.47%
BenchmarkReadFromCacheNonExistentKeys/1024-shards-10 286.9 ns/op 301.6 ns/op +5.12%
BenchmarkReadFromCacheNonExistentKeys/8192-shards-10 269.8 ns/op 296.0 ns/op +9.71%
name old time/op new time/op delta
BenchmarkWriteToCacheWith1Shard-10 508.3 ns/op 585.9 ns/op +15.28%
BenchmarkWriteToLimitedCacheWithSmallInitSizeAnd1Shard-10 277.0 ns/op 275.3 ns/op -0.61%
BenchmarkWriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-10 669.8 ns/op 796.7 ns/op +18.93%
BenchmarkWriteToCache/1-shards-10 534.7 ns/op 513.3 ns/op -4.00%
BenchmarkWriteToCache/512-shards-10 152.2 ns/op 154.5 ns/op +1.51%
BenchmarkWriteToCache/1024-shards-10 127.7 ns/op 149.6 ns/op +17.16%
BenchmarkWriteToCache/8192-shards-10 103.3 ns/op 111.0 ns/op +7.46%
BenchmarkAppendToCache/1-shards-10 4947 ns/op 5155 ns/op +4.21%
BenchmarkAppendToCache/512-shards-10 2143 ns/op 3212 ns/op +49.89%
BenchmarkAppendToCache/1024-shards-10 1630 ns/op 2081 ns/op +27.67%
BenchmarkAppendToCache/8192-shards-10 3760 ns/op 4439 ns/op +18.08%
BenchmarkReadFromCache/1-shards-10 479.9 ns/op 482.6 ns/op +0.56%
BenchmarkReadFromCache/512-shards-10 481.8 ns/op 473.1 ns/op -1.81%
BenchmarkReadFromCache/1024-shards-10 483.2 ns/op 477.0 ns/op -1.28%
BenchmarkReadFromCache/8192-shards-10 519.3 ns/op 495.6 ns/op -4.56%
BenchmarkReadFromCacheWithInfo/1-shards-10 488.7 ns/op 491.1 ns/op +0.49%
BenchmarkReadFromCacheWithInfo/512-shards-10 493.2 ns/op 492.3 ns/op -0.18%
BenchmarkReadFromCacheWithInfo/1024-shards-10 492.9 ns/op 495.0 ns/op +0.43%
BenchmarkReadFromCacheWithInfo/8192-shards-10 495.3 ns/op 566.1 ns/op +14.29%
BenchmarkIterateOverCache/512-shards-10 288.6 ns/op 293.3 ns/op +1.63%
BenchmarkIterateOverCache/1024-shards-10 285.1 ns/op 283.9 ns/op -0.42%
BenchmarkIterateOverCache/8192-shards-10 281.7 ns/op 282.1 ns/op +0.14%
BenchmarkWriteToCacheWith1024ShardsAndSmallShardInitSize-10 207.0 ns/op 200.9 ns/op -2.94%
BenchmarkReadFromCacheNonExistentKeys/1-shards-10 287.1 ns/op 286.0 ns/op -0.38%
BenchmarkReadFromCacheNonExistentKeys/512-shards-10 295.5 ns/op 302.4 ns/op +2.33%
BenchmarkReadFromCacheNonExistentKeys/1024-shards-10 295.5 ns/op 303.4 ns/op +2.67%
BenchmarkReadFromCacheNonExistentKeys/8192-shards-10 288.3 ns/op 305.3 ns/op +5.90%
go version go1.22.5 linux/amd64
name old time/op new time/op delta
BenchmarkWriteToCacheWith1Shard-25 1122 ns/op 1073 ns/op -4.37%
BenchmarkWriteToLimitedCacheWithSmallInitSizeAnd1Shard-25 562.9 ns/op 551.3 ns/op -2.06%
BenchmarkWriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-25 2636 ns/op 2714 ns/op +2.96%
BenchmarkWriteToCache/1-shards-25 1048 ns/op 1006 ns/op -4.01%
BenchmarkWriteToCache/512-shards-25 118.5 ns/op 129.1 ns/op +8.96%
BenchmarkWriteToCache/1024-shards-25 109.3 ns/op 115.9 ns/op +6.04%
BenchmarkWriteToCache/8192-shards-25 119.6 ns/op 121.0 ns/op +1.17%
BenchmarkAppendToCache/1-shards-25 4538 ns/op 5462 ns/op +20.38%
BenchmarkAppendToCache/512-shards-25 925.6 ns/op 916.8 ns/op -0.95%
BenchmarkAppendToCache/1024-shards-25 890.8 ns/op 880.9 ns/op -1.11%
BenchmarkAppendToCache/8192-shards-25 946.7 ns/op 941.2 ns/op -0.58%
BenchmarkReadFromCache/1-shards-25 560.5 ns/op 575.4 ns/op +2.66%
BenchmarkReadFromCache/512-shards-25 556.2 ns/op 567.9 ns/op +2.10%
BenchmarkReadFromCache/1024-shards-25 556.4 ns/op 570.9 ns/op +2.61%
BenchmarkReadFromCache/8192-shards-25 568.0 ns/op 585.0 ns/op +2.99%
BenchmarkReadFromCacheWithInfo/1-shards-25 606.8 ns/op 612.2 ns/op +0.89%
BenchmarkReadFromCacheWithInfo/512-shards-25 560.7 ns/op 585.2 ns/op +4.37%
BenchmarkReadFromCacheWithInfo/1024-shards-25 565.7 ns/op 555.6 ns/op -1.79%
BenchmarkReadFromCacheWithInfo/8192-shards-25 569.8 ns/op 573.0 ns/op +0.56%
BenchmarkIterateOverCache/512-shards-25 274.6 ns/op 237.8 ns/op -13.41%
BenchmarkIterateOverCache/1024-shards-25 226.4 ns/op 258.4 ns/op +14.13%
BenchmarkIterateOverCache/8192-shards-25 203.6 ns/op 217.5 ns/op +6.83%
BenchmarkWriteToCacheWith1024ShardsAndSmallShardInitSize-25 100.7 ns/op 100.2 ns/op -0.50%
BenchmarkReadFromCacheNonExistentKeys/1-shards-25 450.0 ns/op 459.7 ns/op +2.16%
BenchmarkReadFromCacheNonExistentKeys/512-shards-25 443.7 ns/op 458.8 ns/op +3.40%
BenchmarkReadFromCacheNonExistentKeys/1024-shards-25 443.6 ns/op 464.9 ns/op +4.80%
BenchmarkReadFromCacheNonExistentKeys/8192-shards-25 450.6 ns/op 465.6 ns/op +3.32%
BenchmarkFnvHashSum64-25 5.570 ns/op 5.506 ns/op -1.15%
BenchmarkFnvHashStdLibSum64-25 7.091 ns/op 7.076 ns/op -0.21%
name old time/op new time/op delta
BenchmarkWriteToCacheWith1Shard-25 904.6 ns/op 1140 ns/op +26.05%
BenchmarkWriteToLimitedCacheWithSmallInitSizeAnd1Shard-25 537.7 ns/op 559.3 ns/op +4.02%
BenchmarkWriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-25 2466 ns/op 3013 ns/op +22.17%
BenchmarkWriteToCache/1-shards-25 1074 ns/op 1014 ns/op -5.59%
BenchmarkWriteToCache/512-shards-25 126.5 ns/op 116.2 ns/op -8.15%
BenchmarkWriteToCache/1024-shards-25 108.0 ns/op 119.1 ns/op +10.28%
BenchmarkWriteToCache/8192-shards-25 121.2 ns/op 123.2 ns/op +1.65%
BenchmarkAppendToCache/1-shards-25 5294 ns/op 4722 ns/op -10.81%
BenchmarkAppendToCache/512-shards-25 923.4 ns/op 954.7 ns/op +3.38%
BenchmarkAppendToCache/1024-shards-25 879.0 ns/op 895.3 ns/op +1.85%
BenchmarkAppendToCache/8192-shards-25 929.3 ns/op 965.5 ns/op +3.89%
BenchmarkReadFromCache/1-shards-25 546.4 ns/op 544.1 ns/op -0.42%
BenchmarkReadFromCache/512-shards-25 545.4 ns/op 543.2 ns/op -0.40%
BenchmarkReadFromCache/1024-shards-25 547.5 ns/op 547.0 ns/op -0.09%
BenchmarkReadFromCache/8192-shards-25 553.3 ns/op 558.5 ns/op +0.94%
BenchmarkReadFromCacheWithInfo/1-shards-25 575.7 ns/op 584.2 ns/op +1.48%
BenchmarkReadFromCacheWithInfo/512-shards-25 553.2 ns/op 552.1 ns/op -0.20%
BenchmarkReadFromCacheWithInfo/1024-shards-25 556.8 ns/op 551.9 ns/op -0.88%
BenchmarkReadFromCacheWithInfo/8192-shards-25 554.3 ns/op 561.3 ns/op +1.26%
BenchmarkIterateOverCache/512-shards-25 263.4 ns/op 251.1 ns/op -4.67%
BenchmarkIterateOverCache/1024-shards-25 245.0 ns/op 252.6 ns/op +3.10%
BenchmarkIterateOverCache/8192-shards-25 218.5 ns/op 224.4 ns/op +2.70%
BenchmarkWriteToCacheWith1024ShardsAndSmallShardInitSize-25 101.1 ns/op 102.9 ns/op +1.78%
BenchmarkReadFromCacheNonExistentKeys/1-shards-25 429.9 ns/op 445.9 ns/op +3.73%
BenchmarkReadFromCacheNonExistentKeys/512-shards-25 446.8 ns/op 446.8 ns/op 0.00%
BenchmarkReadFromCacheNonExistentKeys/1024-shards-25 445.9 ns/op 447.7 ns/op +0.40%
BenchmarkReadFromCacheNonExistentKeys/8192-shards-25 448.0 ns/op 451.6 ns/op +0.80%
BenchmarkFnvHashSum64-25 5.528 ns/op 5.530 ns/op +0.04%
BenchmarkFnvHashStdLibSum64-25 7.116 ns/op 7.797 ns/op +9.58%
Indeed it looks not that bad but I think since locking part is quite stable and we do not change often there is no need for sacrificing even 4% @cristaloleg @wendigo What do you think?
Indeed it looks not that bad but I think since locking part is quite stable and we do not change often there is no need for sacrificing even 4% @cristaloleg @wendigo What do you think?
I believe it is necessary. As mentioned in the issue #401 , if a panic occurs between lock and unlock, the lock will never be released. Even if a panic is recovered externally, the program cannot continue running
I think that this change is specific to your usage that involves using panic()
and recover()
which feels like an antipattern to me. So 👎🏼 from me
Thanks for the patch but we use defer as it has better performacne in early Go days. I've did some benchmarks some time ago and still was a bit slower. Could you bench your changes?
37