oracle / truffleruby

A high performance implementation of the Ruby programming language, built on GraalVM.
https://www.graalvm.org/ruby/
Other
2.98k stars 180 forks source link

[GR-18163] Optimize Integer#pow #3552

Closed graalvmbot closed 2 months ago

graalvmbot commented 2 months ago

GitHub issue https://github.com/oracle/truffleruby/issues/3544

Added an implementation for small values (that fit into Java long) to complement a version that converts parameters into Java BigInteger.

Benchmarking

Before

jt -u jvm-ce ruby pow-bench.rb
truffleruby 24.1.0-dev-0a3f28ef, like ruby 3.2.2, GraalVM CE JVM [x86_64-darwin]
Warming up --------------------------------------
              modinv    33.015k i/100ms
             modinv1    85.451k i/100ms
             modinv2     2.682k i/100ms
             modinv3     3.738k i/100ms
             modinv4     1.649k i/100ms
Calculating -------------------------------------
              modinv    514.000k (± 9.4%) i/s -      2.542M in   5.055563s
             modinv1    851.831k (± 2.2%) i/s -      4.273M in   5.018367s
             modinv2     51.314k (±15.9%) i/s -    244.062k in   5.006361s
             modinv3     42.923k (±11.7%) i/s -    213.066k in   5.076313s
             modinv4     31.035k (±14.7%) i/s -    150.059k in   5.010427s

After

jt -u jvm-ce ruby pow-bench.rb
truffleruby 24.1.0-dev-0a3f28ef, like ruby 3.2.2, GraalVM CE JVM [x86_64-darwin]
Warming up --------------------------------------
              modinv    30.705k i/100ms
             modinv1    81.037k i/100ms
             modinv2     2.424k i/100ms
             modinv3     3.718k i/100ms
             modinv4    86.200k i/100ms
Calculating -------------------------------------
              modinv    503.256k (± 4.0%) i/s -      2.518M in   5.011915s
             modinv1    828.461k (± 5.6%) i/s -      4.133M in   5.006794s
             modinv2     50.344k (±14.7%) i/s -    242.400k in   5.007805s
             modinv3     45.251k (±13.0%) i/s -    219.362k in   5.025121s
             modinv4    846.609k (± 4.7%) i/s -      4.224M in   5.001086s

Script

require 'benchmark/ips'

def modinv(a0, m0)
  return 1 if m0 == 1
  a, m = a0, m0
  x0, inv = 0, 1
  while a > 1
    inv -= (a / m) * x0
    a, m = m, a % m
    x0, inv = inv, x0
  end
  inv += m0 if inv < 0
  inv
end

def modinv1(r, m)
  inv = r
  while (r * inv) % m != 1; inv %= m; inv *= r end
  inv % m
end

def modinv2(r, m)
  inv = r
  while (r * inv) % m != 1; inv = (inv * r) end
  inv #% m
end

def modinv3(r, m, rescnt)
  (r ** (rescnt-1)) % m
end

def modinv4(r, m, rescnt)
  r.pow(rescnt-1, m)
end

resp7 = [11,13,17,19,23,29,31,37,41,43,47,53,59,61,67,71,73,79,83,89,
          97,101,103,107,109,113,121,127,131,137,139,143,149,151,157,163,
         167,169,173,179,181,187,191,193,197,199,209,211]

Benchmark.ips do |x|
  x.report("modinv") { resp7.each { |r| modinv(r, 210) } }
  x.report("modinv1") { resp7.each { |r| modinv1(r, 210) } }
  x.report("modinv2") { resp7.each { |r| modinv2(r, 210) } }
  x.report("modinv3") { resp7.each { |r| modinv3(r, 210, 48) } }
  x.report("modinv4") { resp7.each { |r| modinv4(r, 210, 48) } }
end