vorner / arc-swap

Support atomic operations on Arc itself
Apache License 2.0

Use parking_lot instead of std::sync::RwLock on aarch64? #110

Closed: PureWhiteWu closed this issue 8 months ago

PureWhiteWu commented 8 months ago

It seems that using parking_lot's RwLock instead of std::sync::RwLock can bring huge performance improvements.

Benchmark results on an M3 Max:

rwlock_read/r1          time:   [146.64 ns 148.57 ns 150.54 ns]
rwlock_read/r3          time:   [414.54 ns 438.36 ns 461.85 ns]
rwlock_read/l1          time:   [126.28 ns 127.46 ns 128.74 ns]
rwlock_read/l3          time:   [1.3937 µs 1.4411 µs 1.4868 µs]
rwlock_read/rw          time:   [5.3935 µs 5.4586 µs 5.5110 µs]
rwlock_read/lw          time:   [5.4186 µs 5.5219 µs 5.6092 µs]
rwlock_read/w2          time:   [5.8282 µs 5.8665 µs 5.9040 µs]
rwlock_read/uncontended time:   [7.5437 ns 7.6723 ns 7.7910 ns]

rwlock_write/r1         time:   [3.6922 µs 3.7275 µs 3.7531 µs]
rwlock_write/r3         time:   [7.6615 µs 7.7270 µs 7.8212 µs]
rwlock_write/l1         time:   [3.3400 µs 3.3854 µs 3.4275 µs]
rwlock_write/l3         time:   [7.4758 µs 7.6461 µs 7.8111 µs]
rwlock_write/rw         time:   [5.9913 µs 6.0927 µs 6.2154 µs]
rwlock_write/lw         time:   [5.8719 µs 5.9004 µs 5.9332 µs]
rwlock_write/w2         time:   [5.6744 µs 5.7110 µs 5.7437 µs]
rwlock_write/uncontended
                        time:   [16.652 ns 16.780 ns 16.911 ns]

parking_rwlock_read/r1  time:   [88.069 ns 89.296 ns 90.559 ns]
parking_rwlock_read/r3  time:   [262.45 ns 272.84 ns 283.35 ns]
parking_rwlock_read/l1  time:   [62.874 ns 64.656 ns 67.062 ns]
parking_rwlock_read/l3  time:   [171.24 ns 199.09 ns 228.50 ns]
parking_rwlock_read/rw  time:   [76.006 ns 77.085 ns 78.206 ns]
parking_rwlock_read/lw  time:   [71.950 ns 72.937 ns 73.925 ns]
parking_rwlock_read/w2  time:   [80.502 ns 81.623 ns 82.801 ns]
parking_rwlock_read/uncontended
                        time:   [4.1132 ns 4.1448 ns 4.1781 ns]

parking_rwlock_write/r1 time:   [22.306 ns 22.408 ns 22.518 ns]
parking_rwlock_write/r3 time:   [32.481 ns 32.651 ns 32.832 ns]
parking_rwlock_write/l1 time:   [22.633 ns 22.815 ns 22.998 ns]
parking_rwlock_write/l3 time:   [32.497 ns 32.641 ns 32.798 ns]
parking_rwlock_write/rw time:   [50.884 ns 51.097 ns 51.294 ns]
parking_rwlock_write/lw time:   [49.763 ns 50.209 ns 50.694 ns]
parking_rwlock_write/w2 time:   [70.355 ns 70.689 ns 71.051 ns]
parking_rwlock_write/uncontended
                        time:   [13.331 ns 13.406 ns 13.490 ns]

vorner commented 8 months ago

Please have a look at where the RwLock is actually used. It serves only as a comparison for benchmarking and correctness testing. It is not used in a "production" setup at all.
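
For readers unfamiliar with this kind of comparison, here is a minimal sketch of what an RwLock-based baseline typically looks like: an `Arc<T>` kept behind a `std::sync::RwLock`, where a load clones the `Arc` under the read lock and a store replaces it under the write lock. The type `RwLockBaseline` and its `load`/`store` methods are hypothetical names chosen for illustration. They are not taken from the arc-swap repository, and the real benchmark code may be structured differently.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical baseline type (not from arc-swap): a shared `Arc<T>`
/// guarded by `std::sync::RwLock`.
struct RwLockBaseline<T> {
    inner: RwLock<Arc<T>>,
}

impl<T> RwLockBaseline<T> {
    fn new(value: T) -> Self {
        Self { inner: RwLock::new(Arc::new(value)) }
    }

    /// Read path: take the read lock, clone the `Arc`, release the lock.
    fn load(&self) -> Arc<T> {
        let guard = self.inner.read().unwrap();
        Arc::clone(&*guard)
    }

    /// Write path: take the write lock and replace the stored `Arc`.
    fn store(&self, value: T) {
        let mut guard = self.inner.write().unwrap();
        *guard = Arc::new(value);
    }
}

fn main() {
    let shared = Arc::new(RwLockBaseline::new(0usize));

    // Three reader threads repeatedly load the current value.
    let readers: Vec<_> = (0..3)
        .map(|_| {
            let shared = Arc::clone(&shared);
            thread::spawn(move || {
                let mut last = 0;
                for _ in 0..1_000 {
                    last = *shared.load();
                }
                last
            })
        })
        .collect();

    // The main thread acts as the single writer.
    for i in 1..=100 {
        shared.store(i);
    }

    for reader in readers {
        reader.join().unwrap();
    }
    println!("final value: {}", *shared.load());
}
```

A baseline like this exists only to give the benchmarks something to compare against, so changing which lock it uses would change the comparison numbers but not the behavior of the library itself.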