SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

Refactor apply_method in GroupBy #488

Closed: paisible-wanderer closed this 4 years ago

paisible-wanderer commented 5 years ago

Hello!

This is a (minor) PR continuing #464; it is best reviewed commit by commit.

The speed is mostly unchanged (see benchmark 2/ in the section "stats at final commit" of the previous PR).

It does fix a minor bug:

gr   = Daru::Vector.new([:a,:a,:a, :b])
vals = Daru::Vector.new([1,1,1,753])

Daru::DataFrame.new({gr: gr, vals: vals}).group_by(:gr).std

## Before
# => #<Daru::DataFrame(2x1)>
#       vals
#     a  0.0
#     b  753

## After
# => #<Daru::DataFrame(2x1)>
#       vals
#     a  0.0
#     b  NaN

# it was wrong because the (unbiased) std is not defined for a vector of size 1:
Daru::Vector.new([753]).std
# => NaN

The error came from the line slice.is_a?(Daru::Vector) ? slice.send(method) : slice, which returns any slice that is not a Daru::Vector unchanged (fine for all the other cases, but not for std).
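To make the failure mode concrete, here is a minimal sketch in plain Ruby. It is not the daru code: Array stands in for Daru::Vector, a bare number stands in for a single-row slice, and sample_std, apply_buggy and apply_fixed are hypothetical helpers written only for this illustration. It assumes the unbiased estimator divides by n - 1, which gives 0.0 / 0 (NaN) for a single value.

```ruby
# Hypothetical stand-in for an unbiased (n - 1) standard deviation.
def sample_std(values)
  n = values.size
  mean = values.sum.to_f / n
  sum_sq = values.sum { |v| (v - mean)**2 }
  # For n == 1 this is 0.0 / 0, which is NaN in Ruby.
  Math.sqrt(sum_sq / (n - 1))
end

# Buggy pattern: a slice that is not a collection is returned untouched,
# so the statistic is never computed for it.
def apply_buggy(slice)
  slice.is_a?(Array) ? sample_std(slice) : slice
end

# Fixed pattern (one possible approach): always wrap the slice
# so the statistic is computed, even for a single value.
def apply_fixed(slice)
  sample_std(Array(slice))
end

# Group :a has three rows, group :b has a single row (a bare value).
groups = { a: [1, 1, 1], b: 753 }

buggy = groups.transform_values { |slice| apply_buggy(slice) }
fixed = groups.transform_values { |slice| apply_fixed(slice) }

puts buggy.inspect
puts fixed.inspect
```

Passing the value through is harmless for aggregations where a single value is its own result (sum, mean, min, max), which is probably why the problem only showed up with std.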

I have also renamed a method and added some deprecation code.
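For readers unfamiliar with the pattern, a rename with a deprecation shim usually looks roughly like the sketch below. The class and method names (GroupSketch, aggregate_values, old_aggregate_values) are hypothetical and are not the names used in this PR; the sketch only shows the general idea of keeping the old name as a thin wrapper that warns and delegates.

```ruby
# Hypothetical class, for illustration only (not part of daru).
class GroupSketch
  # New, preferred method name.
  def aggregate_values(values)
    values.sum
  end

  # Old name kept for backward compatibility: it prints a warning
  # to stderr and forwards its arguments to the new method.
  def old_aggregate_values(*args)
    warn "[DEPRECATION] old_aggregate_values is deprecated; " \
         "use aggregate_values instead."
    aggregate_values(*args)
  end
end

sketch = GroupSketch.new
result_new = sketch.aggregate_values([1, 2, 3])
result_old = sketch.old_aggregate_values([1, 2, 3])
puts result_new
puts result_old
```

Callers of the old name keep working and see a warning, so the old method can be removed in a later release.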

Please let me know if there is anything I can improve.

v0dro commented 4 years ago

@Shekharrajak tests are not needed since it's a refactor and the current tests are passing.