stefan-jansen / machine-learning-for-trading

Code for Machine Learning for Algorithmic Trading, 2nd edition.
https://ml4trading.io

Chapter 2: subsection - Parse_itch_order_flow_messages #229

Closed CDLim0906 closed 2 years ago

CDLim0906 commented 2 years ago

Dear Stefan, first of all, thank you so much for writing such a great book and sharing all the code. I am having an issue
while processing the binary file with the code below:

start = time()
with file_name.open('rb') as data:
    while True:
        # determine message size in bytes
        message_size = int.from_bytes(data.read(2), byteorder='big', signed=False)

        # get message type by reading first byte
        message_type = data.read(1).decode('ascii')
        message_type_counter.update([message_type])

        # read & store message
        try:
            record = data.read(message_size - 1)
            message = message_fields[message_type]._make(unpack(fstring[message_type], record))
            messages[message_type].append(message)
        except Exception as e:
            print(e)
            print(message_type)
            print(record)
            print(fstring[message_type])

        # deal with system events
        if message_type == 'S':
            seconds = int.from_bytes(message.timestamp, byteorder='big') * 1e-9
            print('\n', event_codes.get(message.event_code.decode('ascii'), 'Error'))
            print(f'\t{format_time(seconds)}\t{message_count:12,.0f}')
            if message.event_code.decode('ascii') == 'C':
                store_messages(messages)
                break
        message_count += 1

        if message_count % 2.5e7 == 0:
            seconds = int.from_bytes(message.timestamp, byteorder='big') * 1e-9
            d = format_time(time() - start)
            print(f'\t{format_time(seconds)}\t{message_count:12,.0f}\t{d}')
            res = store_messages(messages)
            if res == 1:
                print(pd.Series(dict(message_type_counter)).sort_values())
                break
            messages.clear()
print('Duration:', format_time(time() - start))

The code works perfectly fine when I use this file as the SOURCE_FILE: 10302019.NASDAQ_ITCH50.gz

but it raises an error when I switch to this file: 01302020.NASDAQ_ITCH50.gz

Both are sample files used in the book.

Screenshots: [image]

Your help would be greatly appreciated. Thanks again.
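For readers following along, the framing that the loop above relies on can be sketched in isolation: each message is a 2-byte big-endian length, then a 1-byte type, then the payload. This is a minimal sketch, not the notebook's code. The `SystemEvent` field names, the `FORMATS` table, and the `read_messages` helper are assumptions made for illustration, and the `'>HH6ss'` layout is inferred from the `HH6ss` format string and the 11-byte record quoted later in this thread.

```python
import io
import struct
from collections import namedtuple

# Assumed layout for a system event ('S') payload: two unsigned shorts,
# a 6-byte timestamp and a 1-byte event code (11 bytes in total).
SystemEvent = namedtuple(
    'SystemEvent', 'stock_locate tracking_number timestamp event_code')
FORMATS = {'S': ('>HH6ss', SystemEvent)}  # hypothetical lookup table


def read_messages(stream):
    """Yield (type, parsed message) pairs from a length-prefixed stream."""
    while True:
        header = stream.read(2)
        if len(header) < 2:      # clean end of stream
            return
        size = int.from_bytes(header, byteorder='big', signed=False)
        body = stream.read(size)
        if len(body) < size:     # truncated input: stop instead of mis-parsing
            raise EOFError(f'expected {size} bytes, got {len(body)}')
        message_type = body[:1].decode('ascii')
        if message_type not in FORMATS:
            raise ValueError(f'unknown message type {message_type!r}')
        fmt, record_type = FORMATS[message_type]
        yield message_type, record_type._make(struct.unpack(fmt, body[1:]))


# One synthetic frame built from the record bytes quoted in this thread,
# so the sketch runs without a data file.
record = b'\x00\x00\x00\x00\t\xf6I\xc8\x0c\xd3O'
payload = b'S' + record
frame = len(payload).to_bytes(2, 'big') + payload

for message_type, message in read_messages(io.BytesIO(frame)):
    seconds = int.from_bytes(message.timestamp, byteorder='big') * 1e-9
    print(message_type, message.event_code.decode('ascii'), seconds)
```

Checking the number of bytes actually read, as done here, turns a truncated file into an explicit `EOFError` rather than a confusing unpack failure further down.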

CDLim0906 commented 2 years ago

I deleted "01302020.NASDAQ_ITCH50.gz" and then redownloaded and unzipped it. Working fine now.
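If the original failure came from an incomplete or corrupted download (the thread does not confirm the cause), one way to catch it before parsing is to decompress the archive once and see whether Python's `gzip` module reaches the end without error. The helper name `gzip_is_intact` below is hypothetical; this is a sketch under that assumption, using only the standard library.

```python
import gzip
import tempfile
import zlib
from pathlib import Path


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole gzip stream decompresses without error."""
    try:
        with gzip.open(path, 'rb') as f:
            # Read to the end in chunks; the CRC and length in the gzip
            # trailer are only verified once the end of the stream is reached.
            while f.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # gzip.BadGzipFile is a subclass of OSError; a truncated file
        # raises EOFError; corrupt deflate data raises zlib.error.
        return False
    return True


# Demonstration on small synthetic files rather than the ITCH sample.
with tempfile.TemporaryDirectory() as tmp:
    good = Path(tmp) / 'good.gz'
    bad = Path(tmp) / 'bad.gz'
    good.write_bytes(gzip.compress(b'ITCH' * 1000))
    bad.write_bytes(good.read_bytes()[:-10])  # simulate an interrupted download
    print('good:', gzip_is_intact(good))
    print('bad: ', gzip_is_intact(bad))
```

For a multi-gigabyte file this takes a while because the whole archive is decompressed, but it avoids discovering the problem only after millions of messages have been parsed.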

Bandita88 commented 1 year ago

Hi @CDLim0906, maybe you can help me. At the same step I get a different error. Could you help me?

unhashable type: 'DataFrame'
S
b'\x00\x00\x00\x00\t\xf6I\xc8\x0c\xd3O'
HH6ss

AttributeError                            Traceback (most recent call last)
Input In [132], in <cell line: 2>()
     23 # deal with system events
     24 if message_type == 'S':
---> 25     seconds = int.from_bytes(message.timestamp, byteorder='big') * 1e-9
     26     print('\n', event_codes.get(message.event_code.decode('ascii'), 'Error'))
     27     print(f'\t{format_time(seconds)}\t{message_count:12,.0f}')

File C:\ProgramData\Anaconda3\lib\site-packages\pandas\core\generic.py:5575, in NDFrame.__getattr__(self, name)
   5568 if (
   5569     name not in self._internal_names_set
   5570     and name not in self._metadata
   5571     and name not in self._accessors
   5572     and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5573 ):
   5574     return self[name]
-> 5575 return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'timestamp'
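One reading of this output, offered as a guess since the thread contains no reply: the `try` block failed before `message` was assigned, so the later `message.timestamp` used whatever `message` already referred to from an earlier notebook cell, in this case a DataFrame. What triggered the first error (`unhashable type: 'DataFrame'`) is not shown in the thread. The sketch below only illustrates the fall-through pattern and one way to avoid it by skipping the record. The `SystemEvent` fields and the `'>HH6ss'` format are assumptions, and a plain string stands in for the stale DataFrame.

```python
from collections import namedtuple
from struct import unpack

# Assumed field names for the 'S' message; not taken from the notebook.
SystemEvent = namedtuple(
    'SystemEvent', 'stock_locate tracking_number timestamp event_code')

# Stands in for an object left over from an earlier cell (a DataFrame in the
# traceback above); a string keeps the sketch free of extra dependencies.
message = 'stale value from an earlier cell'

records = [
    ('S', b'\x00\x00'),                                # too short: unpack fails
    ('S', b'\x00\x00\x00\x00\t\xf6I\xc8\x0c\xd3O'),    # record quoted above
]

parsed, skipped = [], []
for message_type, record in records:
    try:
        message = SystemEvent._make(unpack('>HH6ss', record))
    except Exception as e:
        # Record the failure and move on; without `continue`, the code below
        # would run with the stale `message` and fail on `.timestamp`.
        skipped.append((message_type, record, repr(e)))
        continue
    seconds = int.from_bytes(message.timestamp, byteorder='big') * 1e-9
    parsed.append((message.event_code.decode('ascii'), seconds))

print('parsed: ', parsed)
print('skipped:', skipped)
```

Restarting the kernel and running the notebook cells in order is another way to rule out names that were rebound by a later cell.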