fastruby / fast-ruby

:dash: Writing Fast Ruby :heart_eyes: -- Collect Common Ruby idioms.
https://github.com/fastruby/fast-ruby
5.67k stars 376 forks source link

Comparing array subset checks #125

Open gabteles opened 7 years ago

mblumtritt commented 7 years ago

Your code above is massively dependent to used sample arrays. Using this samples

ARRAY1 = [*(400..500)].freeze
ARRAY2 = [*(1..500)].freeze

will change your results completely. I would prefer to use a more real-world sample for the test to avoid misleadings…

Arcovion commented 7 years ago

Maybe (ARRAY1 & ARRAY2).size == ARRAY1.size is faster - also try the set library; convert both #to_set and check with #subset?

gabteles commented 7 years ago

@mblumtritt What you suggest as a real world example? It would be nice if I generate a random-filled array of integers and let it static in the test or something like that?

@Arcovion Nice, I'll add both.

Thank you guys. (:

mblumtritt commented 7 years ago

@gabteles Do you mean something like this?

ARRAY1 = [*(400..500)].shuffle.freeze
ARRAY2 = [*(1..500)].shuffle.freeze

Then we have to other problem: how is the weight between the set and the subset 🤔 Or do you prefer a complete random set to compare?…

In "real world" I would check my chances to be faster with a complete different algorithm. For sample what about sorting first descending? This might be faster if both sets already nearly sorted…

I'm unsure to give a "general advice" for this subset problem. In "real world" it always depends… ;)

gabteles commented 7 years ago

@mblumtritt Yeah, I agree that the weight between set/subset is a big problem to benchmark. I'm not sure how to handle it.

Before I meant something like this:


Using it statically grants that we won't have performance improved/degraded by randomness effect of shuffle or rand in runtime. But it also suffers of the weight problem.