Closed femtotrader closed 8 years ago
An example with Pandas DataReader
In [1]: import pandas_datareader as pdr
In [2]: panel = pdr.DataReader(["GOOG", "IBM"], "yahoo")
In [3]: panel
Out[3]:
<class 'pandas.core.panel.Panel'>
Dimensions: 6 (items) x 1630 (major_axis) x 2 (minor_axis)
Items axis: Open to Adj Close
Major_axis axis: 2010-01-04 00:00:00 to 2016-06-23 00:00:00
Minor_axis axis: GOOG to IBM
In [6]: import pandas as pd
In [7]: pd.set_option("max_rows", 10)
In [8]: panel.loc["Close"]
Out[8]:
GOOG IBM
Date
2010-01-04 626.751061 132.449997
2010-01-05 623.991055 130.850006
2010-01-06 608.261023 130.000000
2010-01-07 594.101005 129.550003
2010-01-08 602.021036 130.850006
... ... ...
2016-06-17 691.719971 151.990005
2016-06-20 693.710022 153.610001
2016-06-21 695.940002 154.050003
2016-06-22 697.460022 152.919998
2016-06-23 701.869995 155.350006
[1630 rows x 2 columns]
In [11]: panel.loc[:,:,"GOOG"]
Out[11]:
Open High Low Close Volume \
Date
2010-01-04 626.951088 629.511067 624.241073 626.751061 3927000.0
2010-01-05 627.181073 627.841071 621.541045 623.991055 6031900.0
2010-01-06 625.861078 625.861078 606.361042 608.261023 7987100.0
2010-01-07 609.401025 610.001045 592.651008 594.101005 12876600.0
2010-01-08 592.000997 603.251034 589.110988 602.021036 9483900.0
... ... ... ... ... ...
2016-06-17 708.650024 708.820007 688.452026 691.719971 3396800.0
2016-06-20 698.770020 702.479980 693.409973 693.710022 2071400.0
2016-06-21 698.400024 702.770020 692.010010 695.940002 1464300.0
2016-06-22 699.059998 700.859985 693.081970 697.460022 1179300.0
2016-06-23 697.450012 701.950012 687.000000 701.869995 2157400.0
Adj Close
Date
2010-01-04 313.062468
2010-01-05 311.683844
2010-01-06 303.826685
2010-01-07 296.753749
2010-01-08 300.709808
... ...
2016-06-17 691.719971
2016-06-20 693.710022
2016-06-21 695.940002
2016-06-22 697.460022
2016-06-23 701.869995
[1630 rows x 6 columns]
Here is a quick idea :
import six
import collections
import Quandl
def myget(datasets, **kwargs):
if isinstance(datasets, six.string_types):
return Quandl.get(datasets, **kwargs)
else:
d = collections.OrderedDict()
for dataset in datasets:
d[dataset] = Quandl.get(dataset, **kwargs)
columns = list(d[dataset].columns)
return d # returns OrderedDict of DataFrame
In [20]: d = myget(["CHRIS/CME_ES1", "CHRIS/CME_NQ1"])
In [21]: panel = pd.Panel.from_dict(d).transpose(2,1,0)
In [22]: panel
Out[22]:
<class 'pandas.core.panel.Panel'>
Dimensions: 8 (items) x 4813 (major_axis) x 2 (minor_axis)
Items axis: Open to Open Interest
Major_axis axis: 1997-09-09 00:00:00 to 2016-06-23 00:00:00
Minor_axis axis: CHRIS/CME_ES1 to CHRIS/CME_NQ1
In [23]: panel["Open"]
Out[23]:
CHRIS/CME_ES1 CHRIS/CME_NQ1
Date
1997-09-09 934.00 NaN
1997-09-10 934.00 NaN
1997-09-11 916.00 NaN
1997-09-12 908.00 NaN
1997-09-15 925.00 NaN
... ... ...
2016-06-17 2077.75 4423.25
2016-06-20 2071.00 4383.00
2016-06-21 2079.00 4398.50
2016-06-22 2079.25 4397.25
2016-06-23 2083.00 4409.75
[4813 rows x 2 columns]
In [24]: panel[:,:,"CHRIS/CME_ES1"]
Out[24]:
Open High Low Last Change Settle Volume \
Date
1997-09-09 934.00 942.00 933.00 934.00 NaN 934.00 7034.0
1997-09-10 934.00 935.00 915.00 915.00 NaN 915.00 11387.0
1997-09-11 916.00 918.00 900.00 908.00 NaN 908.00 2523.0
1997-09-12 908.00 926.00 904.00 924.00 NaN 924.00 928.0
1997-09-15 925.00 930.00 920.00 922.00 NaN 922.00 208.0
... ... ... ... ... ... ... ...
2016-06-17 2077.75 2083.00 2073.75 2078.50 NaN 2079.12 52547.0
2016-06-20 2071.00 2092.50 2070.25 2082.75 15.25 2074.25 1592550.0
2016-06-21 2079.00 2086.00 2074.00 2079.25 6.25 2080.50 1394626.0
2016-06-22 2079.25 2091.25 2075.25 2077.00 3.75 2076.75 1402829.0
2016-06-23 2083.00 2113.25 2083.00 2113.25 29.00 2105.75 1308321.0
Open Interest
Date
1997-09-09 1109.0
1997-09-10 2325.0
1997-09-11 2549.0
1997-09-12 2163.0
1997-09-15 2107.0
... ...
2016-06-17 607345.0
2016-06-20 2774441.0
2016-06-21 2781532.0
2016-06-22 2769327.0
2016-06-23 2779028.0
[4813 rows x 8 columns]
In [30]: ds = xarray.Dataset(d).to_array()
In [31]: ds = ds.rename({'variable': 'Dataset'})
In [32]:
Out[31]:
<xarray.DataArray (Dataset: 2, Date: 4813, dim_1: 8)>
array([[[ 9.34000000e+02, 9.42000000e+02, 9.33000000e+02, ...,
9.34000000e+02, 7.03400000e+03, 1.10900000e+03],
[ 9.34000000e+02, 9.35000000e+02, 9.15000000e+02, ...,
9.15000000e+02, 1.13870000e+04, 2.32500000e+03],
[ 9.16000000e+02, 9.18000000e+02, 9.00000000e+02, ...,
9.08000000e+02, 2.52300000e+03, 2.54900000e+03],
...,
[ 2.07900000e+03, 2.08600000e+03, 2.07400000e+03, ...,
2.08050000e+03, 1.39462600e+06, 2.78153200e+06],
[ 2.07925000e+03, 2.09125000e+03, 2.07525000e+03, ...,
2.07675000e+03, 1.40282900e+06, 2.76932700e+06],
[ 2.08300000e+03, 2.11325000e+03, 2.08300000e+03, ...,
2.10575000e+03, 1.30832100e+06, 2.77902800e+06]],
[[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
[ nan, nan, nan, ...,
nan, nan, nan],
...,
[ 4.39850000e+03, 4.41575000e+03, 4.38825000e+03, ...,
4.40025000e+03, 1.93001000e+05, 2.21390000e+05],
[ 4.39725000e+03, 4.43400000e+03, 4.39100000e+03, ...,
4.39450000e+03, 1.88817000e+05, 2.18161000e+05],
[ 4.40975000e+03, 4.47925000e+03, 4.40800000e+03, ...,
4.46250000e+03, 1.94779000e+05, 2.13504000e+05]]])
Coordinates:
* Date (Date) datetime64[ns] 1997-09-09 1997-09-10 1997-09-11 ...
* dim_1 (dim_1) object 'Open' 'High' 'Low' 'Last' 'Change' 'Settle' ...
* Dataset (Dataset) <U13 'CHRIS/CME_ES1' 'CHRIS/CME_NQ1'
Hi @femtotrader,
Thanks for your suggestions. This is an interesting feature suggestion. This is the first time we have seen this type of feature request. For now we will tag it on the next major milestone and decide whether there is a high demand for this type of feature when planning our next major release.
Hello,
returns a DataFrame
Isn't there a way to get a Pandas Panel or an xarray.DataArray
An other approach could be to simply return an ordered dict with DataFrame as "value" and ticker as key.
User will be responsible for converting to 3D datastructure.
Kind regards