SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

How to reset filtered dataframe index to 0-N? #466

Open info-rchitect opened 5 years ago

info-rchitect commented 5 years ago

Hi,

I use dataframes that are typically indexed from 0 to N, with no need for non-numeric labels. When I filter a dataframe, the index becomes non-sequential. This makes adding new vectors a pain, because the array of data has to carry the same indices as the dataframe index or the data will be placed in the wrong rows. The reindex method does not do what I want: it inserts nils wherever a row I filtered out used to be. I just want the new, filtered dataframe to be indexed from 0 to N (the new, smaller N).

I get around this currently by the following sequence:

  1. Filter the dataframe
  2. Export the filtered dataframe to a hash (a hash whose values are Arrays, not Vectors)
  3. Re-create the dataframe using the exported hash

This results in the 0-N index I am looking for.

EDIT

The same behavior would also be great for the sort method. After sorting, I don't care what the original index was; I just want to be able to join in data from other dataframes or vectors without worrying about the index. If the other vector is the same size as the dataframe (nrows), then just add its data in row order, from 0 to N.

thx