HCLarsen / devise-uncommon_password

Devise extension to prevent users from using a common password.
MIT License
28 stars 20 forks source link

Perform password lookup ~40.000x faster #31

Closed madejejej closed 2 years ago

madejejej commented 2 years ago

The current lookup method is a bit inefficient. For every lookup, it has to open a file, allocate a lot of memory and iterate over an Array.

The new method introduces two optimizations:

Benchmark: (note: I created a benchmark directory in the repo and not checked it in, so you might have to modify the file path to run it yourself)

require 'benchmark/ips'
require 'set'

PASSWORD_MATCHES = [100, 1_000, 10_000]

def common_passwords_old
  passwords_file = File.join(File.dirname(__FILE__), '..', 'lib', 'devise', 'uncommon_password', "passwords.txt")

  passwords = []
  File.open(passwords_file, "r") do |file|
    file.each { |password| passwords << password.chomp.downcase }
  end
  passwords.select! {|password| (8..128).include? password.length }
  passwords[0..$matches-1]
end

def common_passwords_new
  return @passwords if @passwords

  passwords_file = File.join(File.dirname(__FILE__), '..', 'lib', 'devise', 'uncommon_password', "passwords.txt")
  passwords = []
  File.open(passwords_file, "r") do |file|
    file.each { |password| passwords << password.chomp.downcase }
  end
  passwords.select! {|password| (8..128).include? password.length }

  # Note: We can memoize the whole list for the purpose of the spec
  @passwords = passwords.to_set
end

passwords_file = File.join(File.dirname(__FILE__), '..', 'lib', 'devise', 'uncommon_password', "passwords.txt")
weak_passwords = []

File.open(passwords_file, "r") do |file|
  file.each { |password| weak_passwords << password.chomp.downcase }
end

secure_password = 'ksdnb8283238hdsifuIAHSIh2iuh237728937*(#@(#*@#))'

PASSWORD_MATCHES.each do |matches|
  $matches = matches

  puts "Benchmarking for #{$matches} matches"
  Benchmark.ips do |bench|
    bench.report("password on weaklist - old with #{matches} matches") { common_passwords_old.include?(weak_passwords.sample) }
    bench.report("password on weaklist - new with #{matches} matches") { common_passwords_new.include?(weak_passwords.sample) }
    bench.report("password not on weaklist - old with #{matches} matches") { common_passwords_old.include?(secure_password) }
    bench.report("password not on weaklist - new with #{matches} matches") { common_passwords_new.include?(secure_password) }

    bench.compare!
  end
end

Results:

Benchmarking for 100 matches
Warming up --------------------------------------
password on weaklist - old with 100 matches
                        18.000  i/100ms
password on weaklist - new with 100 matches
                       439.508k i/100ms
password not on weaklist - old with 100 matches
                        19.000  i/100ms
password not on weaklist - new with 100 matches
                       749.231k i/100ms
Calculating -------------------------------------
password on weaklist - old with 100 matches
                        202.715  (± 4.4%) i/s -      1.026k in   5.072639s
password on weaklist - new with 100 matches
                          4.355M (± 2.1%) i/s -     21.975M in   5.048064s
password not on weaklist - old with 100 matches
                        203.283  (± 3.9%) i/s -      1.026k in   5.055536s
password not on weaklist - new with 100 matches
                          7.183M (± 6.0%) i/s -     35.963M in   5.028140s

Comparison:
password not on weaklist - new with 100 matches:  7182737.6 i/s
password on weaklist - new with 100 matches:  4355186.2 i/s - 1.65x  (± 0.00) slower
password not on weaklist - old with 100 matches:      203.3 i/s - 35333.65x  (± 0.00) slower
password on weaklist - old with 100 matches:      202.7 i/s - 35432.66x  (± 0.00) slower

Benchmarking for 1000 matches
Warming up --------------------------------------
password on weaklist - old with 1000 matches
                        18.000  i/100ms
password on weaklist - new with 1000 matches
                       350.866k i/100ms
password not on weaklist - old with 1000 matches
                        19.000  i/100ms
password not on weaklist - new with 1000 matches
                       729.568k i/100ms
Calculating -------------------------------------
password on weaklist - old with 1000 matches
                        194.426  (± 7.7%) i/s -    972.000  in   5.037264s
password on weaklist - new with 1000 matches
                          3.999M (± 9.4%) i/s -     19.999M in   5.051506s
password not on weaklist - old with 1000 matches
                        193.208  (± 8.3%) i/s -    969.000  in   5.063159s
password not on weaklist - new with 1000 matches
                          7.214M (± 5.2%) i/s -     36.478M in   5.070975s

Comparison:
password not on weaklist - new with 1000 matches:  7214300.7 i/s
password on weaklist - new with 1000 matches:  3998718.1 i/s - 1.80x  (± 0.00) slower
password on weaklist - old with 1000 matches:      194.4 i/s - 37105.61x  (± 0.00) slower
password not on weaklist - old with 1000 matches:      193.2 i/s - 37339.53x  (± 0.00) slower

Benchmarking for 10000 matches
Warming up --------------------------------------
password on weaklist - old with 10000 matches
                        20.000  i/100ms
password on weaklist - new with 10000 matches
                       416.675k i/100ms
password not on weaklist - old with 10000 matches
                        18.000  i/100ms
password not on weaklist - new with 10000 matches
                       728.625k i/100ms
Calculating -------------------------------------
password on weaklist - old with 10000 matches
                        199.885  (± 7.0%) i/s -      1.000k in   5.038459s
password on weaklist - new with 10000 matches
                          4.113M (± 7.5%) i/s -     20.834M in   5.100896s
password not on weaklist - old with 10000 matches
                        202.526  (± 1.5%) i/s -      1.026k in   5.067563s
password not on weaklist - new with 10000 matches
                          7.151M (± 5.1%) i/s -     35.703M in   5.007295s

Comparison:
password not on weaklist - new with 10000 matches:  7151325.7 i/s
password on weaklist - new with 10000 matches:  4112592.3 i/s - 1.74x  (± 0.00) slower
password not on weaklist - old with 10000 matches:      202.5 i/s - 35310.69x  (± 0.00) slower
password on weaklist - old with 10000 matches:      199.9 i/s - 35777.24x  (± 0.00) slower
HCLarsen commented 2 years ago

This is great. Thank you for your work on this.

HCLarsen commented 2 years ago

After merging, I've found that I've had to revert it to the previous state, as it's failing the test test_should_return_the_specified_number_of_passwords.

Please check that the entire test suite is passing before generating a PR.