stefan-jansen / machine-learning-for-trading

Code for Machine Learning for Algorithmic Trading, 2nd edition.
https://ml4trading.io

Chapter 2 notebook 1. Key error. #296

Closed gonzalodes closed 1 year ago

gonzalodes commented 1 year ago

Describe the bug

I have trouble with the keys that the last line of code in the notebook accesses. Although I understand the logic behind the code, the key simply does not appear to exist. I've run it on macOS and Windows, with the same result on both. I've also changed the NASDAQ file several times and tried to access "P" and "Q" in other ways by navigating the HDF file, but with little to no luck. Please help me.

KeyError                                  Traceback (most recent call last)
Cell In[17], line 5
      2     print(store.S.keys())
      4 stocks = store['R'].loc[:, ['stock_locate', 'stock']]
----> 5 trades = store['P'].append(store['Q'].rename(columns={'cross_price': 'price'}), sort=False).merge(stocks)
      7 trades['value'] = trades.shares.mul(trades.price)
      8 trades['value_share'] = trades.value.div(trades.value.sum())

File ~/opt/miniconda3/envs/ml4t/lib/python3.8/site-packages/pandas/io/pytables.py:596, in HDFStore.__getitem__(self, key)
    595 def __getitem__(self, key: str):
--> 596     return self.get(key)

File ~/opt/miniconda3/envs/ml4t/lib/python3.8/site-packages/pandas/io/pytables.py:790, in HDFStore.get(self, key)
    788 group = self.get_node(key)
    789 if group is None:
--> 790     raise KeyError(f"No object named {key} in the file")
    791 return self._read_group(group)

KeyError: 'No object named P in the file'

stefan-jansen commented 1 year ago

What's the output in the cell with number 32 here: https://github.com/stefan-jansen/machine-learning-for-trading/blob/main/02_market_and_fundamental_data/01_NASDAQ_TotalView-ITCH_Order_Book/01_parse_itch_order_flow_messages.ipynb?

gonzalodes commented 1 year ago

[Screenshot of the cell output]

The counter differs from the one in the original notebook because I replaced the first file with '01302020.NASDAQ_ITCH50.gz'.

stefan-jansen commented 1 year ago

The code in the previous cell stores all messages, so if type P shows up in the counter, it should be in the store. What does

with pd.HDFStore(itch_store) as store:
    print(store.info())

show?
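One way to make this comparison explicit is to diff the message types in the counter against the keys in the store. The following is a minimal sketch, not code from the notebook: `missing_from_store` is a hypothetical helper, and it assumes the store keys are group paths with a leading slash, which is how `HDFStore.keys()` reports them.

```python
from collections import Counter


def missing_from_store(counted_types, store_keys):
    """Return the counted message types that have no matching key in the store.

    counted_types: iterable of message type codes, e.g. a Counter keyed by 'A', 'P', ...
    store_keys: iterable of store keys such as '/A' (leading slash is stripped).
    """
    stored = {key.lstrip('/') for key in store_keys}
    return sorted(t for t in counted_types if t not in stored)


# Illustrative data only; the real values come from the notebook's counter and store.
example_counter = Counter({'A': 5, 'P': 2, 'R': 1})
example_keys = ['/A', '/R']
print(missing_from_store(example_counter, example_keys))
```

With an open store, the call would look like `missing_from_store(message_type_counter, store.keys())`; any type it returns was seen in the source data but never written.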

gonzalodes commented 1 year ago

[Screenshot of the store.info() output]

That's the thing: I've already tried that. If P appears in the counter, the key should be in the store, but it isn't. It seems that only some of the message types were stored, not all of them.

stefan-jansen commented 1 year ago

I would suggest adding some debug print statements to the loop to check what gets saved.

Since you get a count for P, the message_type clearly shows up in the source data:

        message_type = data.read(1).decode('ascii')        
        message_type_counter.update([message_type])

For instance, you might want to check whether len(messages[message_type]) is ever > 0 after the message_type in question shows up.
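A check like this can be sketched as a tally of messages seen versus rows actually written, per message type. The code below is a simplified stand-in for the parsing loop, not the notebook's implementation: `tally_seen_vs_stored`, `unstored_types`, the `write` callback and `flush_every` are all hypothetical names, and `write` is assumed to return the number of rows it saved.

```python
from collections import Counter, defaultdict


def tally_seen_vs_stored(stream, write, flush_every=3):
    """Buffer messages by type, flush periodically, and count both sides.

    stream: iterable of (message_type, payload) pairs.
    write: callable(message_type, rows) -> number of rows actually saved
           (stands in for whatever writes the buffer to the HDF store).
    """
    seen = Counter()               # messages read from the source
    stored = Counter()             # rows the writer reports as saved
    messages = defaultdict(list)   # in-memory buffer, keyed by message type

    def flush():
        for mtype, rows in messages.items():
            stored[mtype] += write(mtype, rows)
        messages.clear()

    for i, (mtype, payload) in enumerate(stream, start=1):
        seen[mtype] += 1
        messages[mtype].append(payload)
        if i % flush_every == 0:
            flush()
    flush()  # write whatever is still buffered when the stream ends
    return seen, stored


def unstored_types(seen, stored):
    """Message types with fewer rows stored than messages seen."""
    return sorted(t for t in seen if stored[t] < seen[t])


# Illustrative writer that silently drops type 'P' to mimic the reported symptom.
def drop_p(mtype, rows):
    return 0 if mtype == 'P' else len(rows)


seen, stored = tally_seen_vs_stored(
    [('A', 1), ('P', 2), ('A', 3), ('Q', 4)], write=drop_p)
print(unstored_types(seen, stored))
```

If a type appears in `seen` but falls short in `stored`, the problem lies in the write step rather than in parsing.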

gonzalodes commented 1 year ago

I already did that, and it shows up. It's very weird; the problem seems to be with the HDF file.

stefan-jansen commented 1 year ago

Sorry, but I can't replicate this issue. Closing for now; feel free to reopen if the issue persists.