Open wesm opened 7 years ago
I agree, keeping around Panel as a simple data container could make sense. I have also found it to be useful as an intermediate data structure for easier data alignment, though I can't think of particular use cases off the top of my head.
CC @MaximilianR
I don't have a strong view.
xarray is pretty good for aligning! So I predominantly use that:
In [5]: df = pd.DataFrame(np.random.rand(3,4), columns=list('abcd'))
In [6]: df
Out[6]:
          a         b         c         d
0  0.164063  0.014835  0.529693  0.268561
1  0.076066  0.598840  0.887823  0.566114
2  0.599438  0.021646  0.775174  0.959695
In [7]: xr.Dataset({'first': df, 'second': df[list('ab')]})
Out[7]:
<xarray.Dataset>
Dimensions:  (dim_0: 3, dim_1: 4)
Coordinates:
  * dim_0    (dim_0) int64 0 1 2
  * dim_1    (dim_1) object 'a' 'b' 'c' 'd'
Data variables:
    second   (dim_0, dim_1) float64 0.1641 0.01483 nan nan 0.07607 0.5988 ...
    first    (dim_0, dim_1) float64 0.1641 0.01483 0.5297 0.2686 0.07607 ...
And pandas' stack / unstack is pretty good for swapping axes.
What's the use case where you'd need functionality in pandas?
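To make the stack / unstack remark concrete, here is a rough sketch of swapping the "item" axis with the column axis using only pandas. The item names and the toy data are made up for illustration, so treat it as one possible approach rather than a recommended recipe.

```python
import numpy as np
import pandas as pd

# Hypothetical data: two "items", each a 3x2 frame with the same labels.
frames = {
    "first": pd.DataFrame(np.arange(6.0).reshape(3, 2), columns=["a", "b"]),
    "second": pd.DataFrame(np.arange(6.0, 12.0).reshape(3, 2), columns=["a", "b"]),
}

# Stack the frames on top of each other; the row index becomes (item, row).
tall = pd.concat(frames, names=["item", "row"])

# Move the "item" level from the rows into the columns.
# The columns are now a two-level MultiIndex of (column label, item).
swapped = tall.unstack("item")

# Selecting a column label now yields a row-by-item frame,
# i.e. the item axis and the column axis have traded places.
a_across_items = swapped["a"]

# stack() goes the other way and puts "item" back into the row index.
back = swapped.stack("item")
```

Here `swapped["a"]` has one column per item, which is roughly what swapping the items and minor axes of a Panel would have given you.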
we should consider the API that will replace the current to_panel and to_frame workflows
@jreback has built some good .to_xarray, and we've built some decent (not perfect yet) coercion by passing xarray & pandas objects into each other's constructors
this is merged: https://github.com/pandas-dev/pandas/pull/15601
so can think about this (at some point).
The most common use case for panels I've seen has been as an aligning container for data frames -- you can insert a DataFrame "item" as you would a column normally. This can alleviate some awkwardness when working with multi-indexed data.
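For what it's worth, the "aligning container" pattern can be approximated without Panel by keeping the frames in a plain dict and letting `pd.concat` do the alignment. This is only a sketch under my own assumptions: the item names (`first`, `second`, `third`) and the data are invented for the example.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12.0).reshape(3, 4), columns=list("abcd"))

# A plain dict stands in for the container; keys play the role of Panel items.
frames = {"first": df, "second": df[["a", "b"]]}

# Inserting another "item" is ordinary dict assignment.
frames["third"] = df.loc[[0, 2], ["b", "c"]] * 10

# Concatenating along the rows aligns every item on the union of the
# column labels; labels an item does not have are filled with NaN.
aligned = pd.concat(frames, names=["item", "row"])

# Each item can be pulled back out by its key from the outer index level.
second = aligned.loc["second"]
```

The result is a MultiIndex-ed frame keyed by (item, row), so the awkward part is not the alignment itself but remembering which level holds what.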
Couple questions around panels:
If we drop Panel as an analytical data structure (i.e. what is currently offered by the NDFrame construct), we should consider the API that will replace the current to_panel and to_frame workflows.
It may be worthwhile to consider keeping around Panel as a simple container data structure for maintaining a related collection of DataFrames and supporting rudimentary reshaping / axis-swapping functionality. For example, if you have a dict of DataFrame objects in some orientation, you could create a panel, swap axes, then convert to some other data structure (e.g. xarray, MultiIndex-ed DataFrame). If you want to do deeper analysis, you should convert to xarray.
In either case, we'd be eliminating a bunch of thinly supported code
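As a rough illustration of the container-only idea, here is what a minimal wrapper around a dict of DataFrames might look like. `FrameCollection` and its method names are hypothetical and are not part of pandas or of any proposal in this thread; the sketch only assumes `pd.concat` and `DataFrame.unstack`.

```python
from dataclasses import dataclass
from typing import Dict

import numpy as np
import pandas as pd


@dataclass
class FrameCollection:
    """Hypothetical stand-in for a container-only Panel; not a pandas API."""

    frames: Dict[str, pd.DataFrame]

    def to_frame(self) -> pd.DataFrame:
        # One MultiIndex-ed frame: rows are (item, row), columns are the
        # union of the column labels across all items.
        return pd.concat(self.frames, names=["item", "row"])

    def swap_items_and_columns(self) -> "FrameCollection":
        # Re-key the collection by column label; each new frame is row x item.
        wide = self.to_frame().unstack("item")
        labels = wide.columns.get_level_values(0).unique()
        return FrameCollection({label: wide[label] for label in labels})


df = pd.DataFrame(np.arange(12.0).reshape(3, 4), columns=list("abcd"))
panel_like = FrameCollection({"first": df, "second": df[["a", "b"]]})

# After the swap, the collection is keyed by "a".."d" instead of by item.
by_column = panel_like.swap_items_and_columns()
```

Anything beyond this kind of reshaping (reductions, arithmetic across items, and so on) would be left to xarray or to the MultiIndex-ed frame returned by `to_frame()`.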