cloojure / pcapng

PCAPNG Python library
Apache License 2.0
Document how to read PCAPNG files #3

Open cmcqueen opened 2 years ago

cmcqueen commented 2 years ago

It would be great to have some documentation and/or example code of how to read and parse PCAPNG files using this package.

I have some code that works at the moment, but it's incomplete and I'm sure it's not very robust. I also wonder if my code is more verbose than it needs to be.

cloojure commented 2 years ago

The README supplies some examples, as does test code such as block_test.py.

cmcqueen commented 2 years ago

There are some functions to decode blocks, but they don't handle partial block reads well. They require the caller either to read the whole PCAPNG file at once, or to read each block header and then read the right number of bytes for that block, one block at a time.
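
For reference, the header-then-body approach mentioned above can be sketched without the library at all. This is only an illustration: `iter_raw_blocks` and its `endian` parameter are hypothetical names, not part of this package, and little-endian is assumed rather than detected from the file.

```python
import io
import struct


def iter_raw_blocks(f, endian="<"):
    """Yield (block_type, raw_block_bytes) for each block in a PCAPNG stream.

    `endian` is a struct byte-order prefix. "<" (little-endian) is an
    assumption made for this sketch, not something read from the file.
    """
    while True:
        header = f.read(8)
        if not header:
            return  # clean end of file
        if len(header) < 8:
            raise ValueError("truncated block header")
        # Every block starts with its type and its total length in bytes.
        block_type, block_length = struct.unpack(endian + "II", header)
        # The type, length and trailing length fields alone take 12 bytes,
        # and the total length must be a multiple of 4.
        if block_length < 12 or block_length % 4:
            raise ValueError("invalid block length %d" % block_length)
        body = f.read(block_length - 8)
        if len(body) < block_length - 8:
            raise ValueError("truncated block body")
        yield block_type, header + body


# Usage with an in-memory stream holding one minimal 12-byte block.
demo = struct.pack("<III", 0x0A0D0D0A, 12, 12)
for block_type, raw in iter_raw_blocks(io.BytesIO(demo)):
    print(hex(block_type), len(raw))
```

Each raw block yielded this way could then be handed to the package's own decoding function. It works, but it pushes the framing logic onto every caller.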

It would be nice for a user program to be able to just do chunked reads of a PCAPNG file (e.g. reading 1 kB at a time), and pass each chunk to a parsing function. The function could return any whole blocks decoded, along with any remaining bytes of a partially-read block, to be prepended to the next chunk. For example:

    with open(pcap_filename, "rb") as f:
        unprocessed_bytes = b""
        while True:
            bytes_data = f.read(1024)
            if len(bytes_data) == 0:
                break
            blocks_list, unprocessed_bytes = pcapng.segment_chunk(unprocessed_bytes + bytes_data)
            for block in blocks_list:
                do_something_with_block(block)

cmcqueen commented 2 years ago

I'm using the following code:

    import struct

    import pcapng.block

    def segment_chunk(pcapng_bytes):
        '''Unpack blocks in a chunk of a PCAPNG file.
        Return a list of complete blocks in the chunk, along with any "left-over" bytes.
        It is intended for processing chunked reads of a PCAPNG file.
        The "left-over" bytes can be prepended to the next chunk that is read.'''
        blocks_list = []
        dlen = len(pcapng_bytes)
        # Each block starts with an 8-byte header: block type, then total block length.
        while dlen >= 8:
            # "=II" reads the header in native byte order, so this assumes the
            # file was written with the same endianness as the reading machine.
            block_type, block_length = struct.unpack("=II", pcapng_bytes[:8])
            if block_length < 12:
                # Type, length and trailing length alone take 12 bytes.
                raise ValueError("invalid PCAPNG block length: %d" % block_length)
            if dlen < block_length:
                break  # Partial block; wait for more data.
            block_packed_bytes = pcapng_bytes[:block_length]
            blocks_list.append(pcapng.block.unpack_dispatch(block_packed_bytes))
            pcapng_bytes = pcapng_bytes[block_length:]
            dlen = len(pcapng_bytes)
        return blocks_list, pcapng_bytes

    def file_block_gen(f, read_chunk_size=4096):
        '''Generator to read a PCAPNG file in chunks, unpacking the blocks in each chunk.
        Yield each block in the file one-by-one.'''
        unprocessed_bytes = b""
        while True:
            bytes_data = f.read(read_chunk_size)
            if len(bytes_data) == 0:
                # End of file. Any bytes still left over are an incomplete block;
                # hand them to unpack_dispatch rather than dropping them silently.
                if len(unprocessed_bytes):
                    yield pcapng.block.unpack_dispatch(unprocessed_bytes)
                break
            blocks_list, unprocessed_bytes = segment_chunk(unprocessed_bytes + bytes_data)
            for block in blocks_list:
                yield block

I have some code that opens the file and calls file_block_gen(f). It handles the first two blocks specially, checking that the first is a Section Header Block (SHB) and the second is an Interface Description Block (IDB).
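
The SHB/IDB check could look roughly like the sketch below. It works on raw block bytes rather than on the objects returned by unpack_dispatch, since I'd rather not guess at their attributes here. `detect_endian` and `check_leading_blocks` are hypothetical helper names, not part of this package. The block type codes and the byte-order magic are the values given in the PCAPNG specification.

```python
import struct

SHB_TYPE = 0x0A0D0D0A          # Section Header Block type (reads the same in either byte order)
IDB_TYPE = 0x00000001          # Interface Description Block type
BYTE_ORDER_MAGIC = 0x1A2B3C4D  # stored in the SHB right after type and length


def detect_endian(shb_bytes):
    """Return "<" or ">" based on the byte-order magic of a raw SHB."""
    if len(shb_bytes) < 12:
        raise ValueError("too short to be a Section Header Block")
    for endian in ("<", ">"):
        # Read the block type, skip the 4-byte length, read the magic.
        block_type, magic = struct.unpack(endian + "I4xI", shb_bytes[:12])
        if block_type == SHB_TYPE and magic == BYTE_ORDER_MAGIC:
            return endian
    raise ValueError("not a Section Header Block")


def check_leading_blocks(raw_blocks):
    """Check that the first two raw blocks are an SHB followed by an IDB.

    Returns the detected struct byte-order prefix.
    """
    if len(raw_blocks) < 2:
        raise ValueError("expected at least an SHB and an IDB")
    endian = detect_endian(raw_blocks[0])
    if len(raw_blocks[1]) < 4:
        raise ValueError("second block is too short")
    (second_type,) = struct.unpack(endian + "I", raw_blocks[1][:4])
    if second_type != IDB_TYPE:
        raise ValueError("second block is not an Interface Description Block")
    return endian
```

The detected prefix could also replace the hard-coded "=" in segment_chunk, so that a capture written on a machine with the opposite byte order is still segmented correctly.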