fastruby / fast-ruby

:dash: Writing Fast Ruby :heart_eyes: -- Collect Common Ruby idioms.
https://github.com/fastruby/fast-ruby
5.67k stars 376 forks source link

Add String#sub! vs String#[]= #41

Closed sshaw closed 9 years ago

sshaw commented 9 years ago

Note that String#[]= raises an IndexError if the string does not contain the replacement target

s.dup[src]= is also faster then s.sub, was going to add that too but...

May be nice to add a note to the README about running your rake task (or a single fast-ruby script) against multiple rubies with RVM's do command, e.g., rvm 1.9.3,2.2.1,1.7.19 do rake.

JuanitoFatas commented 9 years ago

Thanks for this :heart_eyes:

Do you think this could add to this benchmark (#[] with Regular Expression)?

s[%r{http://}] = ""

Also do you think is worth mentioning gsub! to let people know how slow it is compared to others?

sshaw commented 9 years ago

Do you think this could add to this benchmark (#[] with Regular Expression)?

Good idea. Added.

Also do you think is worth mentioning gsub! to let people know how slow it is compared to others?

What do you mean, something like "checkout the benchmarks for gsub! (with link)" in the String#[]= section?

JuanitoFatas commented 9 years ago

What do you mean, something like "checkout the benchmarks for gsub! (with link)" in the String#[]= section?

I think some people will just use gsub when they only need sub so it is good to include in benchmark to let them have impression that gsub is slower than sub, also if you don't mind I want to use double quotes :blush: :

require "benchmark/ips"

URL = "http://www.thelongestlistofthelongeststuffatthelongestdomainnameatlonglast.com/wearejustdoingthistobestupidnowsincethiscangoonforeverandeverandeverbutitstilllookskindaneatinthebrowsereventhoughitsabigwasteoftimeandenergyandhasnorealpointbutwehadtodoitanyways.html"

def fast
  s = URL.dup
  s["http://"] = ""
end

def slow_1
  s = URL.dup
  s.sub! "http://", ""
end

def slow_2
  s = URL.dup
  s.gsub! "http://", ""
end

def slow_3
  s = URL.dup
  s[%r{http://}] = ""
end

def slow_4
  s = URL.dup
  s.sub! %r{http://}, ""
end

def slow_5
  s = URL.dup
  s.gsub! %r{http://}, ""
end

Benchmark.ips do |x|
  x.report("String#['string']=")   { fast   }
  x.report("String#sub!'string'")  { slow_1 }
  x.report("String#gsub!'string'") { slow_2 }
  x.report("String#[/regexp/]=")   { slow_3 }
  x.report("String#sub!/regexp/")  { slow_4 }
  x.report("String#gsub!/regexp/") { slow_5 }
  x.compare!
end
sshaw commented 9 years ago

I think some people will just use gsub when they only need sub so it is good to include in benchmark to let them have impression that gsub is slower than sub,

Yes definitely, I see this a lot :scream_cat:

also if you don't mind I want to use double quotes

Sure, I love um. I used singe cause that's what a lot (most) of the code uses. So does the template in the README.

sshaw commented 9 years ago

Updated with gsub and double quotes.

JuanitoFatas commented 9 years ago

Merged manually in e97f913.

Sure, I love um. I used singe cause that's what a lot (most) of the code uses. So does the template in the README.

Thanks! Changed :stuck_out_tongue_closed_eyes:

And thanks for your contribution!!! :kissing_heart:

mblumtritt commented 9 years ago

You should add a remark that the sub-string which you want to replace must exist in the source string when String#['string']= is used. This makes this method very different to all others - and not really compareble, or?

sshaw commented 9 years ago

You should add a remark that the sub-string which you want to replace must exist in the source string.. This makes this method very different to all others - and not really compareble, or?

I pondered this in my original comment and think it's a valid point. Considering that performance may be worse if one has to first check for substring existence or deal with exception handling.

On the other hand faster-ruby does not acknowledge the preconditions (or side effects) one considers when writing production code. They just tell you what is faster, for example: block.call vs yield - should they include if block.nil? or if block_given?

At the very least a comment about IndexError should be added to the README or code. @JuanitoFatas what do you think?

mblumtritt commented 9 years ago

Sorry that I missed the original comment. But I would prefer to make it clear for the readers/users if there are different preconditions and side effects. My experience is that there a lot of users out there which will only copy the fastest solution "because that was found in this cool experts banchmark" :flushed: @JuanitoFatas: please add such a comment somehow!

BTW: the yield/block sample is compareable because they need same preconditions and will fail in same cases (but with different exception classes).

My fastest solution for string replacement is this helper method:

def replace(str, fnd, repl)
   idx = str.index(fnd) and str[idx, fnd.size] = repl
   str
end
JuanitoFatas commented 9 years ago

You should add a remark that the sub-string which you want to replace must exist in the source string when String#['string']= is used. This makes this method very different to all others - and not really compareble, or?

I concur with @mblumtritt, added a note in https://github.com/JuanitoFatas/fast-ruby/commit/1ecd378d535c720be49923e86a5f6de336624eb0.

Thank you for bring this up @mblumtritt, really appreciate it! And thanks @sshaw for helping out.

Cheers. :beers:

mblumtritt commented 9 years ago

:+1: