JarryShaw / PyPCAPKit

Python-based Comprehensive Network Packet Analysis Library
https://jarryshaw.github.io/PyPCAPKit/
BSD 3-Clause "New" or "Revised" License

Errors reading pcap files, both in cli and code #240

Open salderma opened 1 month ago

salderma commented 1 month ago

Describe the bug

I'm just starting to investigate how to use your library. My goal is to explore network traffic captures for use in ML algorithms. I've done some work with other libraries but had difficulty with their data structures, so I thought I'd give this one a shot.

To check that everything was installed and working, I ran the CLI against a sample pcap:

pcapkit-cli -f tree --verbose HomeLabDMZ.pcap

This produces the expected output up to around the 206th packet, then produces an error. As best I can tell, the error comes from line 131 of httpv1.py, where the packet seems to have been incorrectly treated as HTTP when it is actually an ACK from the HTTP server to the client for the previous HTTP data packet in the stream.
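The failing statement shown in the traceback below can be reproduced without PyPCAPKit. This is a minimal sketch with made-up payload bytes (assumptions, not taken from the capture): bytes.split returns a single-element list when the blank-line separator is absent, so unpacking into two names raises the same ValueError. The helper split_http_message is hypothetical and not part of the library; it only illustrates one possible guard using bytes.partition, which always returns three parts.

```python
# Hypothetical payloads for illustration; not taken from HomeLabDMZ.pcap.
full_message = b"HTTP/1.1 200 OK\r\nContent-Length: 2\r\n\r\nhi"
continuation = b"raw bytes from the middle of a TCP stream"
empty_ack = b""

SEPARATOR = b"\r\n\r\n"  # blank line between HTTP/1.x header and body


def naive_split(packet: bytes):
    """Same statement as the one reported in the traceback."""
    header, body = packet.split(SEPARATOR, maxsplit=1)
    return header, body


def split_http_message(packet: bytes):
    """Hypothetical guard: return (header, body), or None without a separator."""
    header, sep, body = packet.partition(SEPARATOR)
    if not sep:
        return None
    return header, body


def raises_value_error(packet: bytes) -> bool:
    """Report whether the naive split fails on this payload."""
    try:
        naive_split(packet)
    except ValueError:
        return True
    return False
```

With these inputs, only full_message splits cleanly; the continuation segment and the empty ACK payload both trigger the ValueError in naive_split, while split_http_message returns None for them.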

System information

A clear and concise description of your system information.

Traceback stack

Frame 208: Ethernet:IPv4:TCP
[ERROR] 10/22/2024 10:47:59 AM - not enough values to unpack (expected 2, got 1)
Traceback (most recent call last):
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/utilities/decorators.py", line 117, in behold
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 1156, in _import_next_layer
    next_ = protocol(file_, length, alias=proto, packet=packet,
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 521, in __init__
    self.__post_init__(file, length, **kwargs)  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/application/application.py", line 75, in __post_init__
    super().__post_init__(file, length, **kwargs)  # type: ignore[arg-type]
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 555, in __post_init__
    self._info = self.unpack(length, **kwargs)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 281, in unpack
    return self.read(length, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/application/httpv1.py", line 131, in read
    header, body = packet.split(b'\r\n\r\n', maxsplit=1)
    ^^^^^^^^^^^^
ValueError: not enough values to unpack (expected 2, got 1)
Stack (most recent call last):
  File "/home/salderma/.pyenv/versions/pcapml-3.12.7/bin/pcapkit-cli", line 8, in <module>
    sys.exit(main())
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/__main__.py", line 126, in main
    for _ in extractor:
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/foundation/extraction.py", line 849, in __next__
    return self._exeng.read_frame()
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/foundation/engines/pcap.py", line 145, in read_frame
    frame = Frame(ext._ifile, num=ext._frnum+1, header=self._gbhdr.info,
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 521, in __init__
    self.__post_init__(file, length, **kwargs)  # type: ignore[arg-type]
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/misc/pcap/frame.py", line 357, in __post_init__
    self._info = self.unpack(length, _read=_read, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/misc/pcap/frame.py", line 199, in unpack
    return self.read(length, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/misc/pcap/frame.py", line 267, in read
    return self._decode_next_layer(frame, self._ghdr.network, frame.len)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/misc/pcap/frame.py", line 462, in _decode_next_layer
    next_ = cast('Protocol', self._import_next_layer(proto, length, packet=packet))  # type: ignore[misc,call-arg,redundant-cast]
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/utilities/decorators.py", line 117, in behold
    return func(*args, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 1156, in _import_next_layer
    next_ = protocol(file_, length, alias=proto, packet=packet,
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 521, in __init__
    self.__post_init__(file, length, **kwargs)  # type: ignore[arg-type]
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 555, in __post_init__
    self._info = self.unpack(length, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 281, in unpack
    return self.read(length, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/link/ethernet.py", line 138, in read
    return self._decode_next_layer(ethernet, _type, length - self.length)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 1108, in _decode_next_layer
    next_ = cast('ProtocolBase', self._import_next_layer(proto, length, packet=packet))  # type: ignore[misc,call-arg,redundant-cast]
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/utilities/decorators.py", line 117, in behold
    return func(*args, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 1156, in _import_next_layer
    next_ = protocol(file_, length, alias=proto, packet=packet,
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 521, in __init__
    self.__post_init__(file, length, **kwargs)  # type: ignore[arg-type]
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 555, in __post_init__
    self._info = self.unpack(length, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 281, in unpack
    return self.read(length, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/internet/ipv4.py", line 288, in read
    return self._decode_next_layer(ipv4, ipv4.protocol, ipv4.len - ipv4.hdr_len)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/internet/internet.py", line 182, in _decode_next_layer
    self._import_next_layer(proto, length, packet=packet, version=version))  # type: ignore[arg-type,misc,call-arg]
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/utilities/decorators.py", line 117, in behold
    return func(*args, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/internet/internet.py", line 238, in _import_next_layer
    next_ = protocol(file_, length, version=version, extension=extension,  # type: ignore[abstract]
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 521, in __init__
    self.__post_init__(file, length, **kwargs)  # type: ignore[arg-type]
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 555, in __post_init__
    self._info = self.unpack(length, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 281, in unpack
    return self.read(length, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/transport/tcp.py", line 479, in read
    return self._decode_next_layer(tcp, (tcp.srcport.port, tcp.dstport.port), length - tcp.hdr_len)
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/transport/transport.py", line 158, in _decode_next_layer
    return super()._decode_next_layer(dict_, proto, length, packet=packet)  # type: ignore[arg-type]
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/protocols/protocol.py", line 1108, in _decode_next_layer
    next_ = cast('ProtocolBase', self._import_next_layer(proto, length, packet=packet))  # type: ignore[misc,call-arg,redundant-cast]
  File "/home/salderma/.pyenv/versions/3.12.7/envs/pcapml-3.12.7/lib/python3.12/site-packages/pcapkit/utilities/decorators.py", line 126, in behold
    logger.error(str(exc), exc_info=exc, stack_info=DEVMODE, stacklevel=stacklevel())
  File "/home/salderma/.pyenv/versions/3.12.7/lib/python3.12/logging/__init__.py", line 1568, in error
    self._log(ERROR, msg, args, **kwargs)
  File "/home/salderma/.pyenv/versions/3.12.7/lib/python3.12/logging/__init__.py", line 1672, in _log
    fn, lno, func, sinfo = self.findCaller(stack_info, stacklevel)
  File "/home/salderma/.pyenv/versions/3.12.7/lib/python3.12/logging/__init__.py", line 1639, in findCaller
    traceback.print_stack(f, file=sio)
Frame 209: Ethernet:IPv4:TCP:Raw

Expected behavior

I don't know what to expect, except that errors are unexpected.

Additional context

I have a simple Python script, the beginnings of some ML work, that exhibits the same behavior when it calls pcapkit.extract on the same pcap file. The error also occurs on other packets that seem to follow the same pattern, though sometimes the conversation appears to be HTTPS rather than HTTP.
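To tell these cases apart when inspecting the affected packets by hand, a rough payload check can help. The sketch below is a heuristic of my own, not something PyPCAPKit is known to do: classify_payload and its labels are hypothetical names. It relies only on two well-known facts: a TLS record starts with a content-type byte (20 to 23) followed by major version byte 3, and an HTTP/1.x message starts with a request method or with "HTTP/1.".

```python
# Heuristic only; the names below are hypothetical and not part of PyPCAPKit.
HTTP_METHODS = (
    b"GET ", b"POST ", b"PUT ", b"HEAD ", b"DELETE ",
    b"OPTIONS ", b"PATCH ", b"CONNECT ", b"TRACE ",
)
# TLS record content types: change_cipher_spec, alert, handshake, application_data
TLS_CONTENT_TYPES = {20, 21, 22, 23}


def classify_payload(payload: bytes) -> str:
    """Guess what a TCP payload contains from its first bytes."""
    if not payload:
        return "empty"  # e.g. a bare ACK carrying no data
    if len(payload) >= 3 and payload[0] in TLS_CONTENT_TYPES and payload[1] == 3:
        return "tls"  # looks like the start of a TLS record
    if payload.startswith(b"HTTP/1.") or payload.startswith(HTTP_METHODS):
        return "http"  # looks like the start of an HTTP/1.x message
    return "unknown"  # e.g. a later segment of a larger message
```

A segment classified as "empty", "tls", or "unknown" would have no header/body separator to split on, which matches the pattern described above.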

salderma commented 1 month ago

Also, it appears that the pip installation does not install the Python emoji package:

[WARNING] 10/22/2024 10:19:55 AM - dependency package 'emoji' not found

It's not reported as an error, but I would expect it to be installed if it's a dependency.

JarryShaw commented 2 weeks ago

For the issue mentioned, could you please confirm whether you're still able to obtain the extracted packet data despite the console output? This is likely an issue with the verbose error logging we have in place. We're looking to clean it up soon.

For the CLI dependencies, you can install the required packages through pip install pypcapkit[cli]. We're also updating the message to make it clearer.
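To check whether an optional package such as emoji is present in the active environment, a small standard-library sketch like the following may help. The function names are hypothetical, and the only module name taken from this thread is emoji (from the warning above); which other packages the cli extra pulls in is not stated here.

```python
from importlib.util import find_spec


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found in this environment."""
    return find_spec(module_name) is not None


def missing_modules(names):
    """Return the subset of top-level module names that cannot be found."""
    return [name for name in names if not is_installed(name)]


if __name__ == "__main__":
    # 'emoji' is the package named in the warning message.
    print(missing_modules(["emoji"]))
```

An empty list printed here would suggest the warning should no longer appear for that package.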