Open toobaz opened 7 years ago
@toobaz : I think this makes sense to me. Why would you expect the shape to be the same if you transposed?
@toobaz - agree with your diagnosis this is most likely due to empty-string level dropping magic, xref #11424. Probably could be made consistent.
@chris-b1 : Judging from your response, I'm labeling this as an API issue. I'm not sure I follow the expected output description by @toobaz. Could you explain?
Sure, our basic behavior is that indexing operations that are "slice-like" (e.g. selecting an entire level) on a MultiIndex return a DataFrame. A couple of examples:
In [4]: idx = pd.MultiIndex.from_tuples([('a', ''), ('b', '1'), ('c', '1'), ('c', '2')])
In [5]: df = pd.DataFrame(np.arange(16).reshape(4,4), index=idx, columns=idx)
In [6]: df
Out[6]:
      a   b   c
          1   1   2
a     0   1   2   3
b 1   4   5   6   7
c 1   8   9  10  11
  2  12  13  14  15
In [7]: type(df.loc['b', :])
Out[7]: pandas.core.frame.DataFrame
In [8]: type(df.loc['c', :])
Out[8]: pandas.core.frame.DataFrame
In [9]: type(df.loc[:, 'b'])
Out[9]: pandas.core.frame.DataFrame
In [10]: type(df.loc[:, 'c'])
Out[10]: pandas.core.frame.DataFrame
But, as an undocumented "convenience" feature (linked issue), if the selection is on the columns, and all deeper levels are labeled with empty strings, the selection collapses into a Series. This collapsing doesn't happen with a row selection (this issue).
In [12]: df.loc[:, 'b']
Out[12]:
      1
a     1
b 1   5
c 1   9
  2  13
In [13]: df.loc[:, 'a']
Out[13]:
a       0
b  1    4
c  1    8
   2   12
Name: a, dtype: int32
In [16]: type(df.loc[:, 'a'])
Out[16]: pandas.core.series.Series
In [17]: df.loc['a', :]
Out[17]:
   a  b  c
      1  1  2
   0  1  2  3
In [18]: type(df.loc['a', :])
Out[18]: pandas.core.frame.DataFrame
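For anyone who wants to reproduce the asymmetry outside an interactive session, the snippets above can be condensed into a small script. This is only a sketch built from the same `idx` and `df`; the variable names `col_a` and `row_a` are made up here, and the result may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# Same index as in the session above: the second level of 'a' is an empty string.
idx = pd.MultiIndex.from_tuples([("a", ""), ("b", "1"), ("c", "1"), ("c", "2")])
df = pd.DataFrame(np.arange(16).reshape(4, 4), index=idx, columns=idx)

# Column selection: the empty-string level is dropped and the result collapses.
col_a = df.loc[:, "a"]

# Row selection with the same incomplete key: no collapsing takes place.
row_a = df.loc["a", :]

print(type(col_a).__name__, col_a.shape)
print(type(row_a).__name__, row_a.shape)
```

Comparing the two printed lines shows the inconsistency directly: the same key gives a one-dimensional result on the columns and a two-dimensional one on the rows.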
@chris-b1 : Awesome! That definitely explained it and then some. I think I got confused by the description of the expected output. The expected shape is just the dimensions reversed (it's a transposition).
The expected shape is just the dimensions reversed (it's a transposition).
My example was maybe a bit cryptic, sorry. The thing is that a shape (1,2) when transposed gives (2,1), not (2,).
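The point about shapes can be checked with a throwaway frame. The sketch below is not the original code sample; `frame` is a hypothetical 1x2 DataFrame, and `squeeze` is used only to show that getting from (2,1) to (2,) is a separate, explicit step.

```python
import numpy as np
import pandas as pd

# A hypothetical frame with one row and two columns.
frame = pd.DataFrame(np.zeros((1, 2)))

# Transposing swaps the two dimensions; it never removes one.
transposed = frame.T

# Dropping the length-1 axis has to be requested explicitly.
squeezed = transposed.squeeze(axis=1)

print(frame.shape)       # (1, 2)
print(transposed.shape)  # (2, 1)
print(squeezed.shape)    # (2,)
```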
Code Sample, a copy-pastable example if possible
Problem description
Maybe the "fill an incomplete key with empty string(s)" rule is not implemented at all for rows? (also in light of #17024) If this is the case, then I think it should be.
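One way to see what the rule would amount to for rows is to spell out the empty string by hand. The sketch below reuses the frame from the discussion; `partial` and `full` are names invented for illustration, and this is a comparison of the two key forms rather than a proposed fix.

```python
import numpy as np
import pandas as pd

idx = pd.MultiIndex.from_tuples([("a", ""), ("b", "1"), ("c", "1"), ("c", "2")])
df = pd.DataFrame(np.arange(16).reshape(4, 4), index=idx, columns=idx)

# Incomplete row key: only the first level is given.
partial = df.loc["a", :]

# Complete row key: the empty-string second level is written out explicitly.
full = df.loc[("a", ""), :]

print(type(partial).__name__, partial.shape)
print(type(full).__name__, full.shape)
```

If the rule were applied to rows the way it is to columns, the incomplete key would presumably behave like the complete one.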
Expected Output
The same as Out[4] but reversed.
Output of pd.show_versions()