SciRuby / daru

Data Analysis in RUby
BSD 2-Clause "Simplified" License

added name attr in Index #323

Closed Shekharrajak closed 7 years ago

Shekharrajak commented 7 years ago

Fixes https://github.com/SciRuby/daru/issues/223

Example:

irb(main):001:0> d = Daru::Index.new [:one, 'one', 1, 2, :two]
=> #<Daru::Index (5): {one, one, 1, 2, two}>
irb(main):002:0> d.name = "index_name"
=> "index_name"
irb(main):003:0> d
=> #<Daru::Index index_name(5): {one, one, 1, 2, two}>
irb(main):004:0> d = Daru::Index.new [:one, 'one', 1, 2, :two], name: "index_name"
=> #<Daru::Index index_name(5): {one, one, 1, 2, two}>
irb(main):005:0> d
=> #<Daru::Index index_name(5): {one, one, 1, 2, two}>

irb(main):167:0> mi = Daru::MultiIndex.new(
irb(main):168:1* levels: [[:a,:b,:c], [:one, :two]],
irb(main):169:1* labels: [[0,0,1,1,2,2], [0,1,0,1,0,1]], name: ['s1', 's2'])
=> #<Daru::MultiIndex(6x2)>
  s1  s2
   a one
     two
   b one
     two
   c one
     two
irb(main):170:0> mi.name = ['k1', 'k2']
=> ["k1", "k2"]
irb(main):171:0> mi
=> #<Daru::MultiIndex(6x2)>
  k1  k2
   a one
     two
   b one
     two
   c one
     two

irb(main):172:0> mi.name
=> ["k1", "k2"]

#dataframe

irb(main):045:0> d2 = Daru::Index.new [100, 99, 101, 1, 2], name: "s1"
=> #<Daru::Index: s1(5): {100, 99, 101, 1, 2}>
irb(main):046:0> df = Daru::DataFrame.new({b: [11,12,13,14,15], a: [1,2,3,4,5],
irb(main):047:2*   c: [11,22,33,44,55]},
irb(main):048:1*   order: [:a, :b, :c],
irb(main):049:1*   index: d2)
=> #<Daru::DataFrame(5x3)>
  s1   a   b   c
 100   1  11  11
  99   2  12  22
 101   3  13  33
   1   4  14  44
   2   5  15  55

# multi index 
irb(main):034:0> mi =Daru::MultiIndex.new(
irb(main):035:1*                 levels: [[:a,:b,:c], [:one, :two]],
irb(main):036:1*                 labels: [[0,0,1,1,2,2], [0,1,0,1,0,1]], name: ['s1', 's2'])
=> #<Daru::MultiIndex(6x2)>
  s1  s2
   a one
     two
   b one
     two
   c one
     two
irb(main):037:0> df =Daru::DataFrame.new({
irb(main):038:2*           a: [11, 12, 13, 14, 15, 16], b: [21, 22, 23, 24, 25, 26]},
irb(main):039:1*             name: 'test', index: mi)
=> #<Daru::DataFrame: test (6x2)>
  s1  s2   a   b
   a one  11  21
     two  12  22
   b one  13  23
     two  14  24
   c one  15  25
     two  16  26
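
For readers who want to see the idea in isolation, here is a minimal sketch of an index-like object with an optional name that shows up in its `inspect` output. `NamedIndex` is a hypothetical stand-in written for this illustration, not daru's implementation, and the exact summary format is an assumption.

```ruby
# Hypothetical stand-in for an index with an optional name.
# This is NOT daru's code; it only illustrates the behaviour shown above.
class NamedIndex
  attr_accessor :name # the name can be set at construction or later
  attr_reader :keys

  def initialize(keys, name: nil)
    @keys = keys.dup.freeze
    @name = name
  end

  def size
    @keys.size
  end

  # Include the name in the summary only when one has been set.
  def inspect
    label = name ? " #{name}" : ''
    "#<NamedIndex#{label}(#{size}): {#{keys.join(', ')}}>"
  end
end

idx = NamedIndex.new([:one, 'one', 1, 2, :two])
idx.name = 'index_name'
puts idx.inspect
```

Passing `name:` as a keyword argument keeps the existing positional signature intact, so callers that never name their index are unaffected.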

Shekharrajak commented 7 years ago

The index name is now shown in the DataFrame output. I have added test cases for it.

v0dro commented 7 years ago

Add documentation to all modified user-facing methods specifying that a name attribute can be used with the index. The rest looks good (apart from that specs comment).

Shekharrajak commented 7 years ago

I think this PR is now good to go.

Shekharrajak commented 7 years ago

@zverok, in a new commit I have added test cases for when the index name array is shorter or longer than the number of index levels.

These examples show the MultiIndex behavior when the name array is shorter or longer than the number of index levels:

irb(main):002:0* mi =Daru::MultiIndex.new(
irb(main):003:1*       levels: [[:a,:b,:c], [:one, :two]],
irb(main):004:1*       labels: [[0,0,1,1,2,2], [0,1,0,1,0,1]], name: ['s1', 's2'])
=> #<Daru::MultiIndex(6x2)>
  s1  s2
   a one
     two
   b one
     two
   c one
     two
irb(main):005:0> 
irb(main):006:0* mi =Daru::MultiIndex.new(
irb(main):007:1*       levels: [[:a,:b,:c], [:one, :two]],
irb(main):008:1*       labels: [[0,0,1,1,2,2], [0,1,0,1,0,1]], name: ['s1'])
SizeError: names and levels should be of same size. size of the name array is 1 and size of the MultiIndex levels and labels is 2.
If you don't want to set name for particular level(say level `i`) then put empty string on index `i` of the `name` Array.

irb(main):009:0> 
irb(main):010:0* mi =Daru::MultiIndex.new(
irb(main):011:1*       levels: [[:a,:b,:c], [:one, :two]],
irb(main):012:1*       labels: [[0,0,1,1,2,2], [0,1,0,1,0,1]], name: ['s1', 's2', 's3'])
SizeError: names and levels should be of same size. size of the name array is 3 and size of the MultiIndex levels and labels is 2.

irb(main):013:0> mi
=> #<Daru::MultiIndex(6x2)>
  s1  s2
   a one
     two
   b one
     two
   c one
     two
irb(main):014:0> mi.name = ['k1']
SizeError: names and levels should be of same size. size of the name array is 1 and size of the MultiIndex levels and labels is 2.
If you don't want to set name for particular level(say level `i`) then put empty string on index `i` of the `name` Array.

irb(main):015:0> mi.name = ['k1', 'k2']
=> ["k1", "k2"]
irb(main):016:0> mi
=> #<Daru::MultiIndex(6x2)>
  k1  k2
   a one
     two
   b one
     two
   c one
     two
irb(main):017:0> mi.name = ['k1', 'k2', 'k3']
SizeError: names and levels should be of same size. size of the name array is 3 and size of the MultiIndex levels and labels is 2.

irb(main):018:0> mi.name = ['k1', '']
=> ["k1", ""]
irb(main):019:0> mi
=> #<Daru::MultiIndex(6x2)>
  k1    
   a one
     two
   b one
     two
   c one
     two
irb(main):020:0> mi.name = ['', 'k2']
=> ["", "k2"]
irb(main):021:0> mi
=> #<Daru::MultiIndex(6x2)>
      k2
   a one
     two
   b one
     two
   c one
     two
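
The size check demonstrated above can be sketched in plain Ruby as follows. This is only an illustration under stated assumptions: `validate_level_names` and `LevelNameSizeError` are hypothetical names invented here (the transcript shows daru raising `SizeError`), and allowing `nil` to mean "no names given" is also an assumption.

```ruby
# Stand-in error class for this sketch only (the transcript shows SizeError).
class LevelNameSizeError < StandardError; end

# Hypothetical helper: returns the names unchanged when there is exactly
# one name per level (or when no names were given), otherwise raises.
def validate_level_names(names, level_count)
  return names if names.nil? || names.size == level_count

  raise LevelNameSizeError,
        "expected #{level_count} names, got #{names.size}; " \
        "use '' for a level that should stay unnamed"
end

levels = [[:a, :b, :c], [:one, :two]]

validate_level_names(['k1', 'k2'], levels.size) # accepted
validate_level_names(['k1', ''], levels.size)   # accepted, second level unnamed

begin
  validate_level_names(['k1'], levels.size)     # too short, raises
rescue LevelNameSizeError => e
  puts e.message
end
```

Requiring an empty string for an unnamed level, rather than padding a short array silently, keeps each name aligned with its level position.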

Shekharrajak commented 7 years ago

Travis is still showing a random error.

athityakumar commented 7 years ago

@Shekharrajak - Travis isn't showing random errors. Apparently, RuboCop released version 0.48.1, which has better offense detection than 0.47.0, and hence the build is failing on master. This issue has been fixed in PR #329, so you don't have to fix the RuboCop offenses again. The build will succeed on master once that PR has been merged. :smile:

Shekharrajak commented 7 years ago

Thanks, @athityakumar.

Shekharrajak commented 7 years ago

Checks passed! I hope the PR is now ready for review.

zverok commented 7 years ago

Thanks for your work! :tada: