fastruby / fast-ruby

:dash: Writing Fast Ruby :heart_eyes: -- Collect Common Ruby idioms.
https://github.com/fastruby/fast-ruby

`str.count("\n")` is 1.3-170 times faster than `str.lines.count` or `str.each_line.count` depending on the string size #220

Open ilyazub opened 1 year ago

ilyazub commented 1 year ago

str.count("\n") is 1.3-170 times faster than str.lines.count or str.each_line.count (ref: https://serpapi.com/blog/lines-count-failed-deployments/). The speed difference grows with the number of lines in the string.

$ ruby tmp/string_count_benchmark.rb
Warming up --------------------------------------
  String#count('\n')    86.000  i/100ms
   String#lines.size     1.000  i/100ms
  String#lines.count     1.000  i/100ms
String#each_line.count
                         1.000  i/100ms
Calculating -------------------------------------
  String#count('\n')    771.031  (± 6.6%) i/s -      3.870k in   5.041849s
   String#lines.size      4.785  (± 0.0%) i/s -     24.000  in   5.037242s
  String#lines.count      4.513  (± 0.0%) i/s -     23.000  in   5.112095s
String#each_line.count
                          4.763  (± 0.0%) i/s -     24.000  in   5.075882s

Comparison:
  String#count('\n'):      771.0 i/s
   String#lines.size:        4.8 i/s - 161.12x  (± 0.00) slower
String#each_line.count:        4.8 i/s - 161.87x  (± 0.00) slower
  String#lines.count:        4.5 i/s - 170.86x  (± 0.00) slower

Benchmark code:

require "benchmark/ips"

HTML = "\nruby\n" * 1024 * 1024

# Each method is named after the call it measures, so the report labels match the code.
def string_count
  HTML.count("\n")
end

def lines_size
  HTML.lines.size
end

def lines_count
  HTML.lines.count
end

def each_line_count
  HTML.each_line.count
end

Benchmark.ips do |x|
  x.report("String#count('\\n')")     { string_count    }
  x.report("String#lines.size")       { lines_size      }
  x.report("String#lines.count")      { lines_count     }
  x.report("String#each_line.count")  { each_line_count }
  x.compare!
end
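The benchmark string ends with a newline, so all of the measured calls return the same number. A minimal sketch of when they agree and when they do not, assuming two short sample strings and a hypothetical helper `newline_counts` (neither is part of the benchmark):

```ruby
# Illustrative inputs, not taken from the benchmark above.
with_trailing    = "ruby\nrails\n"
without_trailing = "ruby\nrails"

# Hypothetical helper: runs the three approaches on the same string.
def newline_counts(str)
  {
    count:     str.count("\n"),     # counts newline characters
    lines:     str.lines.size,      # builds an array of lines, then takes its size
    each_line: str.each_line.count  # enumerates lines without keeping the array
  }
end

p newline_counts(with_trailing)
p newline_counts(without_trailing)
```

When the last line has no trailing newline, `str.count("\n")` is one lower than the line-based counts, so it is a drop-in replacement only for newline-terminated input.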

I'd like to add this benchmark to fast-ruby. Wdyt?


Based on our updates to @guilhermesimoes' very helpful gist: https://gist.github.com/guilhermesimoes/d69e547884e556c3dc95?permalink_comment_id=4687645#gistcomment-4687645

ilyazub commented 11 months ago

@JuanVqz @etagwerker what do you think about a benchmark for String#count vs String#lines.count vs String#each_line.count?

JuanVqz commented 11 months ago

This seems to be a Rails-related benchmark. I wonder whether we are adding framework-related benchmarks.

ilyazub commented 11 months ago

It doesn't depend on Rails (https://github.com/ruby/ruby/pull/4001#issuecomment-1715783493). I updated the benchmark code to work with plain Ruby.

ilyazub commented 11 months ago

@JuanVqz I updated the benchmark code above to work with plain Ruby.

What do you think?

ixti commented 10 months ago

IMHO using $/ is bad - it's less obvious than "\n". More than that, $/ can be redefined, so the code may become broken.
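A minimal sketch of the redefinition concern, assuming some other code has reassigned `$/` to a comma. The helper names `count_with_global` and `count_with_literal` are hypothetical, and some Ruby versions may print a deprecation warning on the assignment when warnings are enabled:

```ruby
# Hypothetical helpers used only for this illustration.
def count_with_global(str)
  str.count($/)   # depends on whatever $/ currently holds
end

def count_with_literal(str)
  str.count("\n") # always counts newline characters
end

text = "a,b,c,d\n"
original_separator = $/

begin
  $/ = ","        # assumption: some other code changed the record separator
  redefined = count_with_global(text)
  literal   = count_with_literal(text)
ensure
  $/ = original_separator # restore the global so later code is unaffected
end

p redefined
p literal
```

With the separator redefined, `count_with_global` counts commas instead of newlines, while the literal version is unaffected.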

ilyazub commented 10 months ago

@ixti Sounds good. The performance difference is between String#count and the other methods. $/ vs \n didn't impact performance here.