fastruby / fast-ruby

:dash: Writing Fast Ruby :heart_eyes: -- Collect Common Ruby idioms.
https://github.com/fastruby/fast-ruby
5.67k stars 376 forks source link

Add String#start_with? vs String#[].== #112

Closed jonathanhefner closed 8 years ago

jonathanhefner commented 8 years ago

Compares e.g. url.start_with?("http") vs url[0...4] == "http"

A lot of the existing string benchmarks list their "slow" case first, but the contributing guidelines said to list the "fast" case first. If I need to change anything, I would be happy to. Thanks!

jonathanhefner commented 8 years ago

Based on the conversations in #96, I tried a different approach to these benchmarks:

require 'benchmark/ips'

PREFIX = '_'
STRINGS = (0..99).map{|n| "#{PREFIX if n.odd?}#{n}" }

FAST = STRINGS.each_index.map do |i|
  "STRINGS[#{i}].start_with?(PREFIX)"
end.join(';')

SLOW_LENGTH = STRINGS.each_index.map do |i|
  # use `eql?` instead of `==` to prevent warnings
  "STRINGS[#{i}][0, PREFIX.length].eql?(PREFIX)"
end.join(';')

SLOW_RANGE = STRINGS.each_index.map do |i|
  # use `eql?` instead of `==` to prevent warnings
  "STRINGS[#{i}][0...PREFIX.length].eql?(PREFIX)"
end.join(';')

Benchmark.ips do |x|
  x.report('String#start_with?', FAST)
  x.report('String#[0, n] ==', SLOW_LENGTH)
  x.report('String#[0...n] ==', SLOW_RANGE)
  x.compare!
end

The code is less clear, but the results show a sharper contrast:

$ ruby -v code/string/start_with-vs-substring-\=\=.rb
ruby 2.2.4p230 (2015-12-16 revision 53155) [i386-cygwin]

Calculating -------------------------------------
  String#start_with?     76.406k (± 0.3%) i/s -    387.296k in   5.068975s
    String#[0, n] ==     33.604k (± 0.2%) i/s -    169.364k in   5.039945s
   String#[0...n] ==     21.865k (± 0.3%) i/s -    111.176k in   5.084763s

Comparison:
  String#start_with?:    76405.7 i/s
    String#[0, n] ==:    33604.4 i/s - 2.27x slower
   String#[0...n] ==:    21864.8 i/s - 3.49x slower

If desired, I can update this PR using the above code. Thoughts?

nirvdrum commented 8 years ago

While I guess it's a question of what exactly you're trying to benchmark, note that in the SLOW_LENGTH case you're performing an extra string length calculation on every call and in the SLOW_RANGE case you're making that same extra string length calculation and constructing a Range object.

The code as written is probably representative of what most people would do in the real world. But if your focus is on measuring the cost of the various prefix checks themselves, it'd probably be worthwhile to calculate the values once outside the loop and then substitute them in.

jonathanhefner commented 8 years ago

The code as written is probably representative of what most people would do in the real world.

I'm glad you see it as such--that's exactly what I intended. :) Specifically, I've seen this pattern a few times:

if template_name.first == "_"
  # handle partial view template
end

This uses Active Support's String#first method, which is implemented just like slow_range, including the length retrieval and range allocation.

jonathanhefner commented 8 years ago

I've modified the commit to include the changes discussed above. I've included a case with a pre-allocated range, as per @nirvdrum's suggestion. I've also renamed the cases as per the discussion in #110. Thank you for the feedback. :)

JuanitoFatas commented 8 years ago

@nirvdrum Thank you for the review! @jonathanhefner Thanks for your awesome work!

@nirvdrum @jonathanhefner May I add you as collaborators to this repository?

jonathanhefner commented 8 years ago

@JuanitoFatas Sure thing! Thanks very much! :smiley: :tada:

nirvdrum commented 8 years ago

@JuanitoFatas That'd be fantastic. Thanks!