SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

Added access_row_tuples_by_indexs method #463

Closed Prakriti-nith closed 6 years ago

Prakriti-nith commented 6 years ago

#462

This PR aims to revert the changes to the access_row_tuples_by_indexs method.

Prakriti-nith commented 6 years ago

I tried it with multi-indexes and found two different bugs that need to be handled separately. This method works with multi-indexes only when a single multi-index is provided.

  1. I think there is a bug in the #pos method of Daru::MultiIndex when multiple multi-indexes are provided. access_row_tuples_by_indexs will work in this case once the issue with the pos method is resolved. [image: daru1] [image: daru2]

  2. The pos method works fine in the case below, but #access_row_tuples_by_indexs raises an error: it uses the provided index to create the new rows, but the provided index is not actually a valid index. It is only used to get the positions.

    [image: daru3]
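
To make the second point concrete, here is a toy model in plain Ruby. It is only a sketch: the arrays stand in for a multi-index and its rows, the data is made up, and none of the method names (positions_for, row_tuples_for, exact_position) are daru APIs. It shows why a partial key can be resolved to positions but cannot be treated as a complete index entry.

```ruby
# Toy model only: plain arrays stand in for Daru::MultiIndex and
# Daru::DataFrame. All names and data here are hypothetical.

# Full multi-index tuples, one per row.
INDEX_TUPLES = [
  [:a, :one, :bar],
  [:a, :one, :baz],
  [:b, :two, :bar]
].freeze

# Row values, aligned with INDEX_TUPLES by position.
ROWS = [
  [11, 1],
  [12, 2],
  [13, 3]
].freeze

# Resolve a (possibly partial) key to the positions of all matching rows.
# A key matches a tuple when it equals the tuple's leading levels.
def positions_for(key)
  INDEX_TUPLES.each_index.select do |i|
    INDEX_TUPLES[i].first(key.size) == key
  end
end

# Gather row tuples by position, so the partial key never has to be a
# complete, valid index entry itself.
def row_tuples_for(key)
  positions_for(key).map { |pos| ROWS[pos] }
end

# Treat the key as if it were a full index entry. For a partial key this
# finds nothing, which is the kind of mismatch described in point 2.
def exact_position(key)
  INDEX_TUPLES.index(key)
end
```

In this model, row_tuples_for([:a, :one]) collects both rows under [:a, :one], while exact_position([:a, :one]) returns nil because no full index entry equals the partial key.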

Shekharrajak commented 6 years ago

Ping @zverok, @v0dro. Please look into the PR as soon as possible.

Thanks!

paisible-wanderer commented 6 years ago

Note: in dataframe.rb, it is possible to use this instead:

def access_row_tuples_by_indexs *indexes
  get_sub_dataframe(indexes, by_position: false).map_rows(&:to_a)
end

It is about 10 times slower than the current reverted code but, of the two bugs found by @Prakriti-nith, it is not affected by the second one (i.e. I get:

df.access_row_tuples_by_indexs [:a, :one]
=> [[11, 1], [12, 2]]

)

Shekharrajak commented 6 years ago

Thanks, @paisible-wanderer, for letting us know. @Prakriti-nith, please use something similar for dataframes with a multi-index and test it for different cases.