SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

set_index fails if data_frame size is zero #530

Open dipanm opened 4 years ago

dipanm commented 4 years ago

If we create a data frame with only 1 row and give it a multi-index at construction time, it works:

> df1 = Daru::DataFrame.new( { 'Beer' => ['Kingfisher'] , 'Gallons sold' => [500 ] }, :index  => [[ 'Asia', 'india'] ] )
 => #<Daru::DataFrame(1x2)>
                             Beer Gallons so
       Asia      india Kingfisher        500 

If we have a data frame with only 1 row and set a single column as the index, it also works:

> df2 = Daru::DataFrame.new( { 'Beer' => ['Kingfisher'] , 'Gallons sold' => [500 ], 'Country' => [ 'india' ], 'Continent' => ['Asia'] })
 => #<Daru::DataFrame(1x4)>
                  Beer Gallons so    Country  Continent
          0 Kingfisher        500      india       Asia 
> df2.set_index('Continent')
 => #<Daru::DataFrame(1x3)>
                  Beer Gallons so    Country
       Asia Kingfisher        500      india 

However, if we try to set a multi-index on a data frame with only 1 row, it fails every time:

> df3 = Daru::DataFrame.new( { 'Beer' => ['Kingfisher'] , 'Gallons sold' => [500 ], 'Country' => [ 'india' ], 'Continent' => ['Asia'] })
 => #<Daru::DataFrame(1x4)>
                  Beer Gallons so    Country  Continent
          0 Kingfisher        500      india       Asia 

> df3.set_index( ['Continent', 'Country'] )
Traceback (most recent call last):
        5: from /Users/Dipan/.rvm/rubies/ruby-2.6.3/bin/irb:23:in `<main>'
        4: from /Users/Dipan/.rvm/rubies/ruby-2.6.3/bin/irb:23:in `load'
        3: from /Users/Dipan/.rvm/rubies/ruby-2.6.3/lib/ruby/gems/2.6.0/gems/irb-1.0.0/exe/irb:11:in `<top (required)>'
        2: from (irb):11
        1: from /Users/Dipan/.rvm/gems/ruby-2.6.3/gems/daru-0.2.2/lib/daru/dataframe.rb:1577:in `set_index'

Trying to add indexes one by one also fails.
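
For reference, here is a minimal plain-Ruby sketch of the index tuples I would expect the failing call to produce for this one-row frame. It does not use daru at all: `index_tuples` is a hypothetical helper written only for this illustration, and the column data is copied from df3 above. The result has the same shape as the `:index` value passed to df1 in the first example, which does work.

```ruby
# Hypothetical helper (not part of daru): builds one index tuple per row
# from the named columns of a plain Hash of column arrays.
def index_tuples(columns, keys)
  # Pick the key columns in the requested order, then turn
  # "one array per column" into "one array per row".
  keys.map { |key| columns.fetch(key) }.transpose
end

# Same data as df3, held as a plain Hash instead of a Daru::DataFrame.
data = {
  'Beer'         => ['Kingfisher'],
  'Gallons sold' => [500],
  'Country'      => ['india'],
  'Continent'    => ['Asia']
}

tuples = index_tuples(data, ['Continent', 'Country'])
p tuples  # prints [["Asia", "india"]]
```

So the expected index for a single row is well defined (one tuple with two levels); the sketch only shows the expected shape and says nothing about where set_index itself goes wrong.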