allegro / bigcache

Efficient cache for gigabytes of data written in Go.
http://allegro.tech/2016/03/writing-fast-cache-service-in-go.html
Apache License 2.0
7.45k stars 593 forks source link

Use defer to release the lock.(#401) #402

Closed zzdeys closed 1 month ago

janisz commented 1 month ago

Thanks for the patch but we use defer as it has better performacne in early Go days. I've did some benchmarks some time ago and still was a bit slower. Could you bench your changes?

zzdeys commented 1 month ago

Thanks for the patch but we use defer as it has better performacne in early Go days. I've did some benchmarks some time ago and still was a bit slower. Could you bench your changes?

I conducted two benchmark tests each on macOS and Linux, with the following operating system and Go version: macos: go version go1.22.1 darwin/arm64. linux:go version go1.22.5 linux/amd64 After summarizing and organizing the data, I concluded that performing the unlock operation within the defer statement is indeed slightly slower, but the performance difference is only 4%, which I consider acceptable.

The summarized results are as follows:

|                         |  old time/op  |  new time/op  | delta   |
|-----------------------------|------------|------------|---------------|
| Write                       | 16106.2    | 16864.3    | 0.047068831   |
| Append                      | 40551.8    | 43016.4    | 0.060776587   |
| Read                        | 16773.6    | 16999.9    | 0.013491439   |
| Iterate                     | 3150.4     | 3168.7     | 0.005808786   |
| ReadFromCacheNonExistentKeys| 4737.561   | 4850.682   | 0.023877476   |
| All                         | 81319.561  | 84899.982  | 0.044029025   |

The complete benchmark data is as follows:

go version go1.22.1 darwin/arm64

name                                                       old time/op    new time/op    delta
BenchmarkWriteToCacheWith1Shard-10                           713.1 ns/op    568.2 ns/op  -20.31%
BenchmarkWriteToLimitedCacheWithSmallInitSizeAnd1Shard-10    277.3 ns/op    272.7 ns/op   -1.66%
BenchmarkWriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-10  759.4 ns/op    714.5 ns/op   -5.92%
BenchmarkWriteToCache/1-shards-10                            517.9 ns/op    539.3 ns/op   +4.13%
BenchmarkWriteToCache/512-shards-10                          169.1 ns/op    167.3 ns/op   -1.07%
BenchmarkWriteToCache/1024-shards-10                         134.3 ns/op    120.6 ns/op  -10.21%
BenchmarkWriteToCache/8192-shards-10                         107.8 ns/op    100.3 ns/op   -6.95%
BenchmarkAppendToCache/1-shards-10                          5593 ns/op     5354 ns/op     -4.27%
BenchmarkAppendToCache/512-shards-10                        2274 ns/op     2275 ns/op     +0.04%
BenchmarkAppendToCache/1024-shards-10                       2625 ns/op     1501 ns/op    -42.83%
BenchmarkAppendToCache/8192-shards-10                       2253 ns/op     3261 ns/op    +44.76%
BenchmarkReadFromCache/1-shards-10                           488.8 ns/op    493.3 ns/op    +0.92%
BenchmarkReadFromCache/512-shards-10                         468.9 ns/op    485.0 ns/op    +3.44%
BenchmarkReadFromCache/1024-shards-10                        470.5 ns/op    471.8 ns/op    +0.28%
BenchmarkReadFromCache/8192-shards-10                        489.5 ns/op    567.2 ns/op   +15.88%
BenchmarkReadFromCacheWithInfo/1-shards-10                   483.2 ns/op    492.4 ns/op    +1.90%
BenchmarkReadFromCacheWithInfo/512-shards-10                 479.4 ns/op    477.4 ns/op    -0.42%
BenchmarkReadFromCacheWithInfo/1024-shards-10                484.7 ns/op    483.8 ns/op    -0.19%
BenchmarkReadFromCacheWithInfo/8192-shards-10                497.6 ns/op    488.7 ns/op    -1.79%
BenchmarkIterateOverCache/512-shards-10                      290.7 ns/op    294.5 ns/op    +1.31%
BenchmarkIterateOverCache/1024-shards-10                     285.7 ns/op    289.5 ns/op    +1.33%
BenchmarkIterateOverCache/8192-shards-10                     287.1 ns/op    283.6 ns/op    -1.22%
BenchmarkWriteToCacheWith1024ShardsAndSmallShardInitSize-10  263.9 ns/op    220.1 ns/op  -16.60%
BenchmarkReadFromCacheNonExistentKeys/1-shards-10            289.2 ns/op    301.9 ns/op    +4.40%
BenchmarkReadFromCacheNonExistentKeys/512-shards-10          290.9 ns/op    301.0 ns/op    +3.47%
BenchmarkReadFromCacheNonExistentKeys/1024-shards-10         286.9 ns/op    301.6 ns/op    +5.12%
BenchmarkReadFromCacheNonExistentKeys/8192-shards-10         269.8 ns/op    296.0 ns/op    +9.71%

name                                                       old time/op    new time/op    delta
BenchmarkWriteToCacheWith1Shard-10                           508.3 ns/op    585.9 ns/op   +15.28%
BenchmarkWriteToLimitedCacheWithSmallInitSizeAnd1Shard-10    277.0 ns/op    275.3 ns/op    -0.61%
BenchmarkWriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-10  669.8 ns/op    796.7 ns/op   +18.93%
BenchmarkWriteToCache/1-shards-10                            534.7 ns/op    513.3 ns/op    -4.00%
BenchmarkWriteToCache/512-shards-10                          152.2 ns/op    154.5 ns/op    +1.51%
BenchmarkWriteToCache/1024-shards-10                         127.7 ns/op    149.6 ns/op   +17.16%
BenchmarkWriteToCache/8192-shards-10                         103.3 ns/op    111.0 ns/op    +7.46%
BenchmarkAppendToCache/1-shards-10                          4947 ns/op     5155 ns/op     +4.21%
BenchmarkAppendToCache/512-shards-10                        2143 ns/op     3212 ns/op    +49.89%
BenchmarkAppendToCache/1024-shards-10                       1630 ns/op     2081 ns/op    +27.67%
BenchmarkAppendToCache/8192-shards-10                       3760 ns/op     4439 ns/op    +18.08%
BenchmarkReadFromCache/1-shards-10                           479.9 ns/op    482.6 ns/op    +0.56%
BenchmarkReadFromCache/512-shards-10                         481.8 ns/op    473.1 ns/op    -1.81%
BenchmarkReadFromCache/1024-shards-10                        483.2 ns/op    477.0 ns/op    -1.28%
BenchmarkReadFromCache/8192-shards-10                        519.3 ns/op    495.6 ns/op    -4.56%
BenchmarkReadFromCacheWithInfo/1-shards-10                   488.7 ns/op    491.1 ns/op    +0.49%
BenchmarkReadFromCacheWithInfo/512-shards-10                 493.2 ns/op    492.3 ns/op    -0.18%
BenchmarkReadFromCacheWithInfo/1024-shards-10                492.9 ns/op    495.0 ns/op    +0.43%
BenchmarkReadFromCacheWithInfo/8192-shards-10                495.3 ns/op    566.1 ns/op   +14.29%
BenchmarkIterateOverCache/512-shards-10                      288.6 ns/op    293.3 ns/op    +1.63%
BenchmarkIterateOverCache/1024-shards-10                     285.1 ns/op    283.9 ns/op    -0.42%
BenchmarkIterateOverCache/8192-shards-10                     281.7 ns/op    282.1 ns/op    +0.14%
BenchmarkWriteToCacheWith1024ShardsAndSmallShardInitSize-10  207.0 ns/op    200.9 ns/op    -2.94%
BenchmarkReadFromCacheNonExistentKeys/1-shards-10            287.1 ns/op    286.0 ns/op    -0.38%
BenchmarkReadFromCacheNonExistentKeys/512-shards-10          295.5 ns/op    302.4 ns/op    +2.33%
BenchmarkReadFromCacheNonExistentKeys/1024-shards-10         295.5 ns/op    303.4 ns/op    +2.67%
BenchmarkReadFromCacheNonExistentKeys/8192-shards-10         288.3 ns/op    305.3 ns/op    +5.90%

go version go1.22.5 linux/amd64

name                                                       old time/op    new time/op    delta
BenchmarkWriteToCacheWith1Shard-25                         1122 ns/op     1073 ns/op    -4.37%  
BenchmarkWriteToLimitedCacheWithSmallInitSizeAnd1Shard-25  562.9 ns/op    551.3 ns/op   -2.06%  
BenchmarkWriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-25 2636 ns/op     2714 ns/op    +2.96% 
BenchmarkWriteToCache/1-shards-25                          1048 ns/op     1006 ns/op    -4.01% 
BenchmarkWriteToCache/512-shards-25                        118.5 ns/op    129.1 ns/op   +8.96% 
BenchmarkWriteToCache/1024-shards-25                       109.3 ns/op    115.9 ns/op   +6.04% 
BenchmarkWriteToCache/8192-shards-25                       119.6 ns/op    121.0 ns/op   +1.17% 
BenchmarkAppendToCache/1-shards-25                         4538 ns/op     5462 ns/op   +20.38% 
BenchmarkAppendToCache/512-shards-25                       925.6 ns/op    916.8 ns/op   -0.95% 
BenchmarkAppendToCache/1024-shards-25                      890.8 ns/op    880.9 ns/op   -1.11%  
BenchmarkAppendToCache/8192-shards-25                      946.7 ns/op    941.2 ns/op   -0.58% 
BenchmarkReadFromCache/1-shards-25                         560.5 ns/op    575.4 ns/op   +2.66% 
BenchmarkReadFromCache/512-shards-25                       556.2 ns/op    567.9 ns/op   +2.10% 
BenchmarkReadFromCache/1024-shards-25                      556.4 ns/op    570.9 ns/op   +2.61% 
BenchmarkReadFromCache/8192-shards-25                      568.0 ns/op    585.0 ns/op   +2.99% 
BenchmarkReadFromCacheWithInfo/1-shards-25                 606.8 ns/op    612.2 ns/op   +0.89% 
BenchmarkReadFromCacheWithInfo/512-shards-25               560.7 ns/op    585.2 ns/op   +4.37% 
BenchmarkReadFromCacheWithInfo/1024-shards-25              565.7 ns/op    555.6 ns/op   -1.79% 
BenchmarkReadFromCacheWithInfo/8192-shards-25              569.8 ns/op    573.0 ns/op   +0.56% 
BenchmarkIterateOverCache/512-shards-25                    274.6 ns/op    237.8 ns/op  -13.41% 
BenchmarkIterateOverCache/1024-shards-25                   226.4 ns/op    258.4 ns/op  +14.13% 
BenchmarkIterateOverCache/8192-shards-25                   203.6 ns/op    217.5 ns/op   +6.83% 
BenchmarkWriteToCacheWith1024ShardsAndSmallShardInitSize-25 100.7 ns/op    100.2 ns/op   -0.50% 
BenchmarkReadFromCacheNonExistentKeys/1-shards-25          450.0 ns/op    459.7 ns/op   +2.16% 
BenchmarkReadFromCacheNonExistentKeys/512-shards-25        443.7 ns/op    458.8 ns/op   +3.40% 
BenchmarkReadFromCacheNonExistentKeys/1024-shards-25       443.6 ns/op    464.9 ns/op   +4.80%
BenchmarkReadFromCacheNonExistentKeys/8192-shards-25       450.6 ns/op    465.6 ns/op   +3.32% 
BenchmarkFnvHashSum64-25                                    5.570 ns/op    5.506 ns/op   -1.15% 
BenchmarkFnvHashStdLibSum64-25                              7.091 ns/op    7.076 ns/op   -0.21% 

name                                                       old time/op    new time/op    delta
BenchmarkWriteToCacheWith1Shard-25                         904.6 ns/op     1140 ns/op    +26.05%  
BenchmarkWriteToLimitedCacheWithSmallInitSizeAnd1Shard-25  537.7 ns/op     559.3 ns/op    +4.02%  
BenchmarkWriteToUnlimitedCacheWithSmallInitSizeAnd1Shard-25 2466 ns/op     3013 ns/op    +22.17%  
BenchmarkWriteToCache/1-shards-25                          1074 ns/op      1014 ns/op     -5.59%  
BenchmarkWriteToCache/512-shards-25                        126.5 ns/op     116.2 ns/op    -8.15%  
BenchmarkWriteToCache/1024-shards-25                       108.0 ns/op     119.1 ns/op    +10.28% 
BenchmarkWriteToCache/8192-shards-25                       121.2 ns/op     123.2 ns/op    +1.65%  
BenchmarkAppendToCache/1-shards-25                         5294 ns/op      4722 ns/op    -10.81%  
BenchmarkAppendToCache/512-shards-25                       923.4 ns/op     954.7 ns/op     +3.38% 
BenchmarkAppendToCache/1024-shards-25                      879.0 ns/op     895.3 ns/op     +1.85% 
BenchmarkAppendToCache/8192-shards-25                      929.3 ns/op     965.5 ns/op     +3.89% 
BenchmarkReadFromCache/1-shards-25                         546.4 ns/op     544.1 ns/op     -0.42% 
BenchmarkReadFromCache/512-shards-25                       545.4 ns/op     543.2 ns/op     -0.40% 
BenchmarkReadFromCache/1024-shards-25                      547.5 ns/op     547.0 ns/op     -0.09% 
BenchmarkReadFromCache/8192-shards-25                      553.3 ns/op     558.5 ns/op     +0.94% 
BenchmarkReadFromCacheWithInfo/1-shards-25                 575.7 ns/op     584.2 ns/op     +1.48% 
BenchmarkReadFromCacheWithInfo/512-shards-25               553.2 ns/op     552.1 ns/op     -0.20% 
BenchmarkReadFromCacheWithInfo/1024-shards-25              556.8 ns/op     551.9 ns/op     -0.88% 
BenchmarkReadFromCacheWithInfo/8192-shards-25              554.3 ns/op     561.3 ns/op     +1.26% 
BenchmarkIterateOverCache/512-shards-25                    263.4 ns/op     251.1 ns/op     -4.67% 
BenchmarkIterateOverCache/1024-shards-25                   245.0 ns/op     252.6 ns/op     +3.10% 
BenchmarkIterateOverCache/8192-shards-25                   218.5 ns/op     224.4 ns/op     +2.70% 
BenchmarkWriteToCacheWith1024ShardsAndSmallShardInitSize-25 101.1 ns/op     102.9 ns/op     +1.78%
BenchmarkReadFromCacheNonExistentKeys/1-shards-25          429.9 ns/op     445.9 ns/op     +3.73% 
BenchmarkReadFromCacheNonExistentKeys/512-shards-25        446.8 ns/op     446.8 ns/op      0.00% 
BenchmarkReadFromCacheNonExistentKeys/1024-shards-25       445.9 ns/op     447.7 ns/op     +0.40% 
BenchmarkReadFromCacheNonExistentKeys/8192-shards-25       448.0 ns/op     451.6 ns/op     +0.80% 
BenchmarkFnvHashSum64-25                                    5.528 ns/op     5.530 ns/op     +0.04%
BenchmarkFnvHashStdLibSum64-25                              7.116 ns/op     7.797 ns/op     +9.58%
janisz commented 1 month ago

Indeed it looks not that bad but I think since locking part is quite stable and we do not change often there is no need for sacrificing even 4% @cristaloleg @wendigo What do you think?

zzdeys commented 1 month ago

Indeed it looks not that bad but I think since locking part is quite stable and we do not change often there is no need for sacrificing even 4% @cristaloleg @wendigo What do you think?

I believe it is necessary. As mentioned in the issue #401 , if a panic occurs between lock and unlock, the lock will never be released. Even if a panic is recovered externally, the program cannot continue running

wendigo commented 1 month ago

I think that this change is specific to your usage that involves using panic() and recover() which feels like an antipattern to me. So 👎🏼 from me