shogo82148 / Redis-Fast

fast perl binding for Redis database
https://metacpan.org/release/Redis-Fast

Define PERL_NO_GET_CONTEXT #142

Closed JRaspass closed 1 year ago

JRaspass commented 1 year ago

This is more efficient on threaded perls and a no-op on non-threaded ones. Distro perls tend to be threaded by default, so it would be a nice win.

See https://blogs.perl.org/users/nick_wellnhofer/2015/03/writing-xs-like-a-pro---perl-no-get-context-and-static-functions.html or https://perldoc.perl.org/perlguts#How-multiple-interpreters-and-concurrency-are-supported for details.
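As a rough illustration of why this helps, here is a toy C model, not Perl's actual API. On a threaded perl, XS code that does not define `PERL_NO_GET_CONTEXT` fetches the interpreter context implicitly inside each function that needs it. With the define in place (before `perl.h` is included), helpers instead receive the context as an argument via the `pTHX_` / `aTHX_` macros. Every name below (`toy_interp`, `toy_get_context`, `push_implicit`, `push_explicit`) is hypothetical and only counts how often the context would be looked up in each style.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a Perl interpreter; NOT Perl's real struct. */
typedef struct {
    long stack_depth;
} toy_interp;

static toy_interp the_interp;   /* the "current" interpreter */
static unsigned long lookups;   /* how many times the context was fetched */

/* Models the implicit context fetch that threaded builds perform when
   PERL_NO_GET_CONTEXT is not defined (hypothetical simplification). */
static toy_interp *toy_get_context(void) {
    lookups++;
    return &the_interp;
}

/* Implicit style: every helper re-fetches the context itself. */
static void push_implicit(void) {
    toy_interp *my = toy_get_context();
    my->stack_depth++;
}

/* Explicit style: the caller hands the context down as an argument,
   which is the role pTHX_ / aTHX_ play in real XS code. */
static void push_explicit(toy_interp *my) {
    assert(my != NULL);
    my->stack_depth++;
}

/* Call the implicit helper n times; return the number of lookups. */
unsigned long lookups_implicit(int n) {
    lookups = 0;
    for (int i = 0; i < n; i++) {
        push_implicit();
    }
    return lookups;
}

/* Fetch the context once at the entry point, then pass it along;
   return the number of lookups. */
unsigned long lookups_explicit(int n) {
    lookups = 0;
    toy_interp *my = toy_get_context();
    for (int i = 0; i < n; i++) {
        push_explicit(my);
    }
    return lookups;
}
```

In this model `lookups_implicit(n)` performs one lookup per helper call, while `lookups_explicit(n)` performs a single lookup regardless of `n`. How much that matters in practice depends on how costly the real lookup is on a given perl build.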

In benchmarks on my laptop it looks like mostly a wash to my eye:

Before:

   00_ping: 10 wallclock secs ( 3.30 usr +  2.65 sys =  5.95 CPU) @ 71600.34/s (n=426022)
    10_set: 11 wallclock secs ( 1.82 usr +  3.35 sys =  5.17 CPU) @ 113881.04/s (n=588765)
  11_set_r: 12 wallclock secs ( 2.51 usr +  3.08 sys =  5.59 CPU) @ 88577.46/s (n=495148)
    20_get: 11 wallclock secs ( 1.79 usr +  3.22 sys =  5.01 CPU) @ 120583.83/s (n=604125)
  21_get_r: 10 wallclock secs ( 2.03 usr +  2.99 sys =  5.02 CPU) @ 109215.34/s (n=548261)
   30_incr: 11 wallclock secs ( 1.73 usr +  3.38 sys =  5.11 CPU) @ 122467.12/s (n=625807)
 30_incr_r: 12 wallclock secs ( 2.33 usr +  3.52 sys =  5.85 CPU) @ 107963.08/s (n=631584)
  40_lpush: 12 wallclock secs ( 2.00 usr +  3.68 sys =  5.68 CPU) @ 111348.59/s (n=632460)
   50_lpop: 11 wallclock secs ( 1.73 usr +  3.37 sys =  5.10 CPU) @ 124643.14/s (n=635680)
  90_h_get: 11 wallclock secs ( 2.51 usr +  2.94 sys =  5.45 CPU) @ 94156.70/s (n=513154)
  90_h_set: 10 wallclock secs ( 2.57 usr +  2.66 sys =  5.23 CPU) @ 84964.24/s (n=444363)

After:

   00_ping:  9 wallclock secs ( 2.86 usr +  2.35 sys =  5.21 CPU) @ 73578.31/s (n=383343)
    10_set: 10 wallclock secs ( 1.90 usr +  3.12 sys =  5.02 CPU) @ 118519.72/s (n=594969)
  11_set_r: 11 wallclock secs ( 2.21 usr +  2.80 sys =  5.01 CPU) @ 97914.57/s (n=490552)
    20_get: 12 wallclock secs ( 2.18 usr +  3.50 sys =  5.68 CPU) @ 111350.35/s (n=632470)
  21_get_r: 11 wallclock secs ( 2.23 usr +  3.50 sys =  5.73 CPU) @ 104246.60/s (n=597333)
   30_incr:  9 wallclock secs ( 1.78 usr +  3.31 sys =  5.09 CPU) @ 117863.06/s (n=599923)
 30_incr_r: 11 wallclock secs ( 1.99 usr +  3.12 sys =  5.11 CPU) @ 106215.07/s (n=542759)
  40_lpush: 12 wallclock secs ( 1.85 usr +  3.48 sys =  5.33 CPU) @ 115742.78/s (n=616909)
   50_lpop: 11 wallclock secs ( 1.84 usr +  3.27 sys =  5.11 CPU) @ 119098.43/s (n=608593)
  90_h_get: 10 wallclock secs ( 2.38 usr +  2.79 sys =  5.17 CPU) @ 91557.25/s (n=473351)
  90_h_set: 11 wallclock secs ( 2.81 usr +  3.10 sys =  5.91 CPU) @ 86829.61/s (n=513163)
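
One way to put a number on "a wash" is to compute the relative change in requests per second for each benchmark. The sketch below copies the rates from the Before and After runs above; the function names `pct_change` and `count_faster` are illustrative and not part of Redis-Fast, and a single run per side says nothing about run-to-run noise.

```c
#include <assert.h>
#include <stddef.h>

/* Requests per second copied from the "Before" and "After" runs,
   in the order the benchmarks are listed. */
static const double before_rate[] = {
    71600.34, 113881.04, 88577.46, 120583.83, 109215.34, 122467.12,
    107963.08, 111348.59, 124643.14, 94156.70, 84964.24
};
static const double after_rate[] = {
    73578.31, 118519.72, 97914.57, 111350.35, 104246.60, 117863.06,
    106215.07, 115742.78, 119098.43, 91557.25, 86829.61
};
#define N_BENCH (sizeof before_rate / sizeof before_rate[0])

/* Relative change in percent; positive means the "after" run was faster. */
double pct_change(double before, double after) {
    assert(before > 0.0);
    return 100.0 * (after - before) / before;
}

/* Number of benchmarks whose rate went up after the change. */
size_t count_faster(void) {
    size_t faster = 0;
    for (size_t i = 0; i < N_BENCH; i++) {
        if (pct_change(before_rate[i], after_rate[i]) > 0.0) {
            faster++;
        }
    }
    return faster;
}
```

On these numbers, five of the eleven benchmarks got faster and six got slower, which is consistent with reading the result as a wash.
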
JRaspass commented 1 year ago

Having run GitHub Actions on my fork, I believe the tests will fail on 5.10, as Test::Deep (and all of Ricardo's dists) now depend on 5.12. We might want to bump this dist's minimum perl (5.10 is ancient!) or pin an older Test::Deep in the test requires.

shogo82148 commented 1 year ago

Thanks!