Closed AlexKnowsIt closed 2 years ago
This is the whole Error-message:
/usr/local/lib/python3.7/dist-packages/deepdow/utils.py in raw_to_Xy(raw_data, lookback, horizon, gap, freq, included_assets, included_indicators, use_log)
275 y_list = []
276 for ind in indicators:
--> 277 X, timestamps, y = (returns_to_Xy(returns.xs(ind, axis=1, level=1),
278 lookback=lookback,
279 horizon=horizon,
/usr/local/lib/python3.7/dist-packages/pandas/core/generic.py in xs(self, key, axis, level, drop_level)
3471 if not isinstance(labels, MultiIndex):
3472 raise TypeError("Index must be a MultiIndex")
-> 3473 loc, new_ax = labels.get_loc_level(key, level=level, drop_level=drop_level)
3474
3475 # create the tuple of the indexer
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/multi.py in get_loc_level(self, key, level, drop_level)
2883 return indexer, maybe_mi_droplevels(indexer, ilevels, drop_level)
2884 else:
-> 2885 indexer = self._get_level_indexer(key, level=level)
2886 return indexer, maybe_mi_droplevels(indexer, [level], drop_level)
2887
/usr/local/lib/python3.7/dist-packages/pandas/core/indexes/multi.py in _get_level_indexer(self, key, level, indexer)
2974 if not locs.any():
2975 # The label is present in self.levels[level] but unused:
-> 2976 raise KeyError(key)
2977 return locs
2978
KeyError: 'Open'
Hey! Thank you for your interest!
Hmmm, I am not sure where it comes from. Could you please upload your raw_df
here? Ideally just a minimal version of it that reproduces the error.
Cheers
Hello Jan,
thanks for helping me out. I am not sure if this is the best way to share my DataFrame, if you need it in a different format just tell me. Its the result of the yfinance package with the function
mein_portfolio = yf.download(tickers = ticker,
start="2010-01-01",
end="2021-01-01",
interval = "1d",
# group by ticker (to access via data['SPY'])
# (optional, default is 'column')
group_by = 'ticker',
)
<!--StartFragment-->
| 1COV.DE | MRK.DE | LIN.DE | FRE.DE | DWNI.DE | BEI.DE | MUV2.DE | DPW.DE | ... | BAS.DE | VOW3.DE | SIE.DE | CON.DE | VNA.DE | RWE.DE | HEN3.DE | ALV.DE
-- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | -- | --
NaN | NaN | NaN | NaN | NaN | 27.215351 | 27.428233 | 27.077607 | 27.219526 | 752504.0 | 39.533226 | 39.533226 | 39.533226 | 39.533226 | 0.0 | 12.068662 | 12.510301 | 12.068662 | 12.393932 | 96522.0 | 4.828712 | 5.049168 | 4.800266 | 5.006499 | 193278.0 | 39.338344 | 39.998988 | 39.295446 | 39.848843 | 336203.0 | 62.419827 | 62.733922 | 61.677412 | 61.848740 | 873193.0 | 8.497876 | 8.767550 | 8.497876 | 8.767550 | 3020790.0 | ... | 25.048560 | 25.849699 | 24.985161 | 25.849699 | 3245626.0 | 49.865665 | 50.309594 | 48.564486 | 49.368153 | 906136.0 | 41.763412 | 42.234259 | 41.666662 | 42.227810 | 2734897.0 | 27.314565 | 27.993757 | 27.171385 | 27.993757 | 133257.0 | NaN | NaN | NaN | NaN | NaN | 39.725997 | 40.033593 | 39.638945 | 40.021984 | 2022078.0 | 30.165420 | 30.165420 | 29.641154 | 29.842794 | 784200.0 | 50.129575 | 50.751911 | 50.015382 | 50.552078 | 1676777.0
NaN | NaN | NaN | NaN | NaN | 27.169440 | 27.202834 | 26.873077 | 27.019175 | 488178.0 | NaN | NaN | NaN | NaN | NaN | 12.339255 | 12.410758 | 12.050434 | 12.109323 | 137034.0 | 5.120282 | 5.153706 | 5.042056 | 5.042056 | 279601.0 | 39.831689 | 40.024737 | 39.522816 | 39.638645 | 331772.0 | 61.763061 | 62.020048 | 61.563181 | 61.962940 | 797939.0 | 8.739330 | 8.864759 | 8.673479 | 8.780094 | 2735783.0 | ... | 25.800709 | 25.820882 | 25.293508 | 25.457775 | 3292371.0 | 49.368148 | 49.666657 | 47.592428 | 48.411404 | 893352.0 | 42.182657 | 42.498704 | 41.911759 | 42.253605 | 2584717.0 | 27.894637 | 30.185536 | 27.843238 | 29.744978 | 420128.0 | NaN | NaN | NaN | NaN | NaN | 40.045200 | 40.213507 | 39.540283 | 39.766628 | 2040667.0 | 29.850860 | 29.875059 | 29.003970 | 29.318529 | 1147700.0 | 50.529229 | 51.208663 | 50.415040 | 50.706223 | 1742206.0
NaN | NaN | NaN | NaN | NaN | 27.048393 | 27.219531 | 26.772899 | 27.002478 | 511138.0 | 39.589531 | 39.589531 | 39.589531 | 39.589531 | 0.0 | 12.222885 | 12.222885 | 11.964912 | 12.046229 | 77268.0 | 5.034945 | 5.148729 | 5.034945 | 5.034945 | 98025.0 | 39.681540 | 39.810238 | 39.595743 | 39.608612 | 342784.0 | 62.105725 | 62.191389 | 61.363310 | 61.563194 | 727120.0 | 8.767549 | 8.924336 | 8.733056 | 8.877300 | 2595800.0 | ... | 25.486592 | 25.754599 | 25.379965 | 25.619154 | 2345128.0 | 48.671639 | 49.896274 | 47.929201 | 49.750847 | 1202268.0 | 42.214906 | 42.479355 | 42.079458 | 42.408405 | 2166413.0 | 29.744973 | 30.086398 | 29.047423 | 29.939554 | 231260.0 | NaN | NaN | NaN | NaN | NaN | 39.714390 | 39.934928 | 39.464832 | 39.691177 | 1535087.0 | 29.439508 | 29.737935 | 29.286262 | 29.576622 | 993200.0 | 50.900349 | 51.265758 | 50.671968 | 51.100182 | 1362139.0
NaN | NaN | NaN | NaN | NaN | 26.960732 | 27.624421 | 26.960732 | 27.570158 | 1088580.0 | 39.617683 | 39.617683 | 39.617683 | 39.617683 | 0.0 | 12.057447 | 12.079879 | 11.917244 | 12.029407 | 74370.0 | 5.034944 | 5.089703 | 4.999387 | 5.047034 | 115634.0 | 39.625772 | 39.681540 | 38.960837 | 39.209652 | 481551.0 | 61.334759 | 61.763075 | 61.134880 | 61.705967 | 948916.0 | 8.849081 | 8.902388 | 8.792638 | 8.858488 | 2907679.0 | ... | 25.457778 | 25.578815 | 25.218591 | 25.449133 | 2505352.0 | 49.727887 | 50.753518 | 49.314570 | 50.324898 | 1108024.0 | 42.376153 | 43.085651 | 42.034307 | 42.911499 | 3336865.0 | 31.279587 | 33.915587 | 31.261229 | 33.776077 | 1470941.0 | NaN | NaN | NaN | NaN | NaN | 39.551892 | 39.685378 | 39.197872 | 39.522873 | 1765257.0 | 29.383050 | 29.471772 | 28.778129 | 29.124950 | 877300.0 | 50.769027 | 51.065924 | 50.443584 | 50.512100 | 1926360.0
NaN | NaN | NaN | NaN | NaN | 27.570161 | 27.707905 | 26.964909 | 27.290491 | 730274.0 | NaN | NaN | NaN | NaN | NaN | 12.141570 | 12.149982 | 11.777042 | 11.777042 | 74403.0 | 5.042056 | 5.081880 | 5.034944 | 5.056279 | 90482.0 | 39.252552 | 39.484207 | 37.871205 | 38.077122 | 978115.0 | 62.020063 | 62.020063 | 61.020660 | 61.591747 | 902151.0 | 8.867895 | 9.109347 | 8.817723 | 9.068583 | 5462896.0 | ... | 25.503885 | 25.610512 | 25.126371 | 25.371323 | 3307737.0 | 50.439708 | 51.358185 | 50.294281 | 50.868328 | 937837.0 | 43.117893 | 43.362992 | 42.685749 | 43.182396 | 2634386.0 | 33.812793 | 34.756320 | 32.718743 | 34.400204 | 449586.0 | NaN | NaN | NaN | NaN | NaN | 39.673771 | 39.784040 | 39.041174 | 39.464836 | 1665122.0 | 29.116886 | 29.358857 | 28.447441 | 28.818459 | 1224500.0 | 50.917475 | 50.917475 | 49.490096 | 50.238041 | 2425290.0
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ...
48.683556 | 49.280001 | 48.556444 | 48.800888 | 544896.0 | 134.279385 | 135.566712 | 134.031820 | 134.972565 | 193144.0 | 204.061152 | 206.339508 | 203.367744 | 206.339508 | 751699.0 | 36.296987 | 37.071978 | 36.159648 | 36.738438 | 1330030.0 | 41.357513 | 41.710410 | 41.220276 | 41.543766 | 569336.0 | 90.810540 | 92.596589 | 90.592251 | 92.358452 | 327184.0 | 227.299843 | 229.993200 | 226.626507 | 228.165558 | 310667.0 | 39.068311 | 39.233857 | 38.834605 | 38.951458 | 2387989.0 | ... | 60.852591 | 61.519835 | 60.661949 | 60.862122 | 2923353.0 | 146.430482 | 147.387668 | 145.023997 | 145.434219 | 952157.0 | 111.475366 | 111.884555 | 110.695958 | 111.397423 | 1390621.0 | 117.800003 | 118.900002 | 117.500000 | 118.400002 | 325628.0 | 55.863411 | 56.387859 | 55.669170 | 56.154770 | 1031102.0 | 31.953356 | 32.382001 | 31.846195 | 32.372257 | 1600706.0 | 87.669159 | 88.728963 | 87.551408 | 88.454201 | 329138.0 | 186.491139 | 188.844513 | 185.936275 | 186.759003 | 969429.0
48.986666 | 49.553779 | 48.986666 | 49.240891 | 615538.0 | 134.972562 | 136.012339 | 134.378399 | 135.071579 | 202684.0 | 207.726336 | 209.113168 | 206.042339 | 208.617874 | 549094.0 | 36.895399 | 37.699818 | 36.846350 | 37.483997 | 1204434.0 | 41.465345 | 41.877059 | 41.435938 | 41.739822 | 380069.0 | 92.517205 | 93.053019 | 92.159995 | 92.457672 | 231334.0 | 228.454137 | 233.071319 | 228.357940 | 232.590363 | 246062.0 | 39.146216 | 39.185168 | 38.883293 | 38.951458 | 2036098.0 | ... | 60.976510 | 61.939249 | 60.966976 | 61.662819 | 1802073.0 | 147.250934 | 149.575542 | 146.801649 | 148.911377 | 944403.0 | 111.904042 | 113.988968 | 111.670221 | 113.988968 | 1149343.0 | 118.949997 | 122.199997 | 118.550003 | 121.650002 | 471544.0 | 56.154770 | 56.873459 | 56.154770 | 56.659794 | 603944.0 | 32.537871 | 33.044445 | 32.440450 | 32.927544 | 1523201.0 | 88.552333 | 88.846725 | 87.963555 | 88.787849 | 212154.0 | 186.567662 | 191.666657 | 186.548525 | 191.217026 | 987162.0
49.788442 | 49.827554 | 49.201777 | 49.299553 | 439192.0 | 135.319144 | 137.497730 | 135.319144 | 136.210388 | 155267.0 | 211.193396 | 213.967046 | 209.212220 | 211.688690 | 692211.0 | 37.866585 | 38.298227 | 37.837156 | 37.915638 | 968546.0 | 42.151532 | 42.680877 | 41.779029 | 42.396599 | 366515.0 | 93.072874 | 94.422332 | 92.933960 | 94.104805 | 190535.0 | 235.764677 | 236.245633 | 232.975123 | 234.802765 | 225114.0 | 40.022620 | 40.158953 | 39.691533 | 39.964195 | 2223149.0 | ... | 62.148946 | 62.673212 | 62.120351 | 62.168015 | 2008445.0 | 152.329905 | 152.408030 | 149.204376 | 149.204376 | 686060.0 | 115.352942 | 116.327207 | 115.274999 | 116.229782 | 1067229.0 | 123.000000 | 124.150002 | 122.400002 | 123.050003 | 321404.0 | 57.028850 | 57.514450 | 56.601520 | 57.456177 | 602391.0 | 33.122383 | 34.116055 | 33.122383 | 34.116055 | 1845090.0 | 89.474755 | 90.671940 | 89.298121 | 90.299042 | 353813.0 | 193.053823 | 195.062798 | 192.814658 | 193.819153 | 738979.0
49.573336 | 50.003557 | 49.475556 | 49.592892 | 662659.0 | 137.151130 | 138.042358 | 136.507466 | 136.903564 | 151454.0 | 213.967040 | 214.066105 | 210.400916 | 211.193390 | 508977.0 | 38.131456 | 38.200126 | 37.582098 | 37.601719 | 864433.0 | 42.553442 | 43.082788 | 42.426007 | 42.710285 | 457634.0 | 94.700160 | 95.176435 | 94.263568 | 94.660469 | 193822.0 | 235.668482 | 237.592307 | 233.937037 | 234.514191 | 258539.0 | 40.266066 | 40.314758 | 39.516253 | 39.662319 | 1880277.0 | ... | 62.663677 | 62.844789 | 61.824860 | 62.025032 | 2112445.0 | 150.024838 | 151.079687 | 147.797908 | 148.364395 | 738894.0 | 116.327207 | 116.658453 | 114.144856 | 114.378677 | 902680.0 | 123.800003 | 123.949997 | 122.050003 | 122.750000 | 388663.0 | 57.611568 | 58.388527 | 57.261936 | 57.825233 | 668538.0 | 34.193985 | 34.505725 | 33.960182 | 34.047859 | 1677246.0 | 90.671937 | 91.084080 | 90.338291 | 90.730812 | 245532.0 | 194.536643 | 195.445466 | 192.814654 | 193.197311 | 665400.0
49.475555 | 49.768892 | 49.358223 | 49.358223 | 339126.0 | 137.497732 | 138.983124 | 136.953086 | 138.983124 | 166548.0 | 210.995285 | 211.193400 | 209.410338 | 210.995285 | 340269.0 | 37.523235 | 37.572284 | 36.856155 | 37.121025 | 1072853.0 | 42.690678 | 42.994563 | 42.641666 | 42.827915 | 266469.0 | 94.541393 | 94.660466 | 93.707909 | 93.707909 | 121069.0 | 234.321809 | 236.245634 | 233.552277 | 233.552277 | 165429.0 | 39.633109 | 39.788914 | 39.428614 | 39.438351 | 1227197.0 | ... | 62.339593 | 62.339593 | 61.548432 | 61.691414 | 1144797.0 | 148.090917 | 148.872299 | 146.489090 | 148.872299 | 578887.0 | 114.144854 | 115.274997 | 114.086393 | 114.495583 | 796354.0 | 122.400002 | 126.500000 | 121.250000 | 121.250000 | 464126.0 | 57.786384 | 58.116592 | 57.669841 | 58.038895 | 505759.0 | 33.969921 | 34.155019 | 33.599733 | 33.677666 | 1036749.0 | 90.887829 | 91.064463 | 90.338297 | 90.573814 | 159727.0 | 192.527646 | 194.967129 | 192.001480 | 192.001480 | 526249.0
<p>2792 rows × 150 columns</p><!--EndFragment-->
So the problem is that raw_to_Xy
does not support zero (and negative) entries inside of raw_df
. Unfortunately, your DataFrame does have a couple of zero entries for the volumes (which I guess is not impossible).
The reason why 0's are not supported is because one then computes per day percentage changes which would lead to division by zero.
I guess the simplest way around this is to replace 0's with NAN's and that way the raw_to_Xy
will simply forward fill those values.
from deepdow.utils import raw_to_Xy
import numpy as np
import yfinance as yf
tickers = [
"1COV.DE",
"MRK.DE",
"LIN.DE",
"FRE.DE",
"DWNI.DE",
"BEI.DE",
"MUV2.DE",
"DPW.DE",
"BAS.DE",
"VOW3.DE",
"SIE.DE",
"CON.DE",
"VNA.DE",
"RWE.DE",
"HEN3.DE",
"ALV.DE",
]
raw_df = yf.download(tickers = tickers,
start="2010-01-01",
end="2021-01-01",
interval = "1d",
group_by = 'ticker',
)
print((raw_df == 0).sum().sum())
raw_df_adj = raw_df.replace(0, np.nan)
print((raw_df_adj == 0).sum().sum())
lookback, gap, horizon = 5, 2, 4
X, timestamps, y, asset_names, indicators = raw_to_Xy(raw_df_adj,
lookback=lookback,
gap=gap,
freq="B",
horizon=horizon)
Standard output:
[*********************100%***********************] 16 of 16 completed
223
0
Anyway, maybe deepdow
should raise an exception to let the user know that none of the assets made it through the check. The exception you saw in your traceback is coming from pandas
because things do not work out evenutally (asset_names=[]
in the code).
Ah, that make sense! The code is running now, thanks!
Another short question because it seems my problems are directly related to #30:
My n_sample size is not n_timesteps - lookback - horizon - gap + 1
, it seems my vector from the raw_to_df
function is significantly larger (2776 vs 2851). Is that due to the forward and backward fill of the gaps in my dataset?
Ah, that make sense! The code is running now, thanks! Another short question because it seems my problems are directly related to #30: My n_sample size is not
n_timesteps - lookback - horizon - gap + 1
, it seems my vector from theraw_to_df
function is significantly larger (2776 vs 2851). Is that due to the forward and backward fill of the gaps in my dataset?
Yes, exactly! The assert
in the docs actually only holds for the data in the docs, not in general:) It is actually also related to this https://github.com/jankrepl/deepdow/issues/71#issuecomment-713786289.
Perfect! Thanks a lot for your help!
Hello, I am running into a weird error, that I don't understand. I used Yahoo Finance to download a DataFrame with the Index of MultiIndex([( 'SIE.DE', 'Open'), ( 'SIE.DE', 'High'), ( 'SIE.DE', 'Low'), ( 'SIE.DE', 'Close'), ( 'SIE.DE', 'Volume'), ('EOAN.DE', 'Open'), ('EOAN.DE', 'High'), ('EOAN.DE', 'Low'), ('EOAN.DE', 'Close'), ('EOAN.DE', 'Volume'), ... ('BAYN.DE', 'Open'), ('BAYN.DE', 'High'), ('BAYN.DE', 'Low'), ('BAYN.DE', 'Close'), ('BAYN.DE', 'Volume'), ( 'BEI.DE', 'Open'), ( 'BEI.DE', 'High'), ( 'BEI.DE', 'Low'), ( 'BEI.DE', 'Close'), ( 'BEI.DE', 'Volume')], length=150) So I have a DataFrame, where there are the Assets in the zero level and the first level being indecators. If i run now the raw_to_Xy Function like this
I receive an KeyError: 'Open' with the explanation
I tried to stay as close to the documentation as possible and don't see, where I am wrong. Other than that: Great repository, I am really fascinated by all the possibilties that this gives for modern Portfolio Optimization!