kbandla / dpkt

fast, simple packet creation / parsing, with definitions for the basic TCP/IP protocols

pcap.Reader does not read Layer-2 packets from UNSW-NB15 dataset #492

Closed dhelmr closed 3 years ago

dhelmr commented 3 years ago

Hi,

I am trying to read a pcap file from the UNSW-NB15 dataset with dpkt, but dpkt.pcap.Reader does not yield any IP packets at all: every frame decodes as a dpkt.ethernet.Ethernet packet whose payload is not IP. I tested this with several of the dataset's pcap files, which can be downloaded here. The pcap files can be read with tcpdump or Wireshark without any problems. Wireshark shows pcap as the file type.

Example code for reproduction:

import dpkt

pcap_file = "data/unsw-nb15/01/2.pcap"  # download from unsw-nb15 dataset
reader = dpkt.pcap.Reader(open(pcap_file, "rb"))

non_ip_count = 0
ip_count = 0

for timestamp, buf in reader:
    eth = dpkt.ethernet.Ethernet(buf)

    if not isinstance(eth.data, dpkt.ip.IP):
        non_ip_count += 1
        continue

    ip_count += 1
    print("Got IP Packet: %s->%s" % ( eth.ip.src, eth.ip.dst))

print("IP packets: %s; Non-IP packets: %s" % (ip_count, non_ip_count))

Output:

❯ python test_dpkt_ip_packets.py
IP packets: 0; Non-IP packets: 1614980

This happens with both Python 3.6 and 3.8 when using dpkt 1.9.4 or 1.9.3. With version 1.9.2 or 1.9.1 the error is:

Traceback (most recent call last):
  File "/home/d/.local/lib/python3.8/site-packages/dpkt/dpkt.py", line 89, in __init__
    self.unpack(args[0])
  File "/home/d/.local/lib/python3.8/site-packages/dpkt/llc.py", line 36, in unpack
    dpkt.Packet.unpack(self, buf)
  File "/home/d/.local/lib/python3.8/site-packages/dpkt/dpkt.py", line 171, in unpack
    struct.unpack(self.__hdr_fmt__, buf[:self.__hdr_len__])):
struct.error: unpack requires a buffer of 3 bytes

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test_dpkt.py", line 13, in <module>
    eth = dpkt.ethernet.Ethernet(buf)
  File "/home/d/.local/lib/python3.8/site-packages/dpkt/ethernet.py", line 79, in __init__
    dpkt.Packet.__init__(self, *args, **kwargs)
  File "/home/dd/.local/lib/python3.8/site-packages/dpkt/dpkt.py", line 89, in __init__
    self.unpack(args[0])
  File "/home/d/.local/lib/python3.8/site-packages/dpkt/ethernet.py", line 168, in unpack
    self.data = self.llc = llc.LLC(self.data[:eth_len])
  File "/home/d/.local/lib/python3.8/site-packages/dpkt/dpkt.py", line 92, in __init__
    raise NeedData
dpkt.dpkt.NeedData

Am I doing something wrong? The error also persists if I open the file in Wireshark and save it again as pcapng or pcap. Other pcap files that I captured myself work (both pcapng and pcap). I would appreciate any hints or help if this is something I can solve myself. Thanks in advance!

dhelmr commented 3 years ago

In fact, the reader does not seem to decode any layer-2 protocols at all, not only IP (by layer 2 I mean everything above the link layer/Ethernet).

obormot commented 3 years ago

@dhelmr The pcap files you're referring to are 1.9G each. Would you mind attaching a smaller sample pcap here? (Just a few packets is usually all that's needed to reproduce the issue.)

dhelmr commented 3 years ago

Sure: pcap.zip

(I split this one off with editcap -F pcap, if it matters. dpkt's behaviour is the same, though.)

obormot commented 3 years ago

Wireshark says the capture uses the Linux cooked capture link type instead of Ethernet. dpkt has the SLL class for it:

In [13]: from dpkt.sll import SLL
In [14]: SLL(pp[1])
Out[14]: SLL(type=4, hdr=b'\x00PV\xa5wc\x00\x00', data=IP(tos=192, len=64, id=61204, ttl=1, p=89, sum=35426, 
src=b'\n(U\x01', dst=b'\xe0\x00\x00\x05', opts=b'', 
data=OSPF(v=2, type=1, len=44, router=3232297459, sum=60120, auth=b'\x00\x00\x00\x00\x00\x00\x00\x00', 
data=b'\xff\xff\xff\x00\x00\n\x02\x01\x00\x00\x00(\n(U\x01\x00\x00\x00\x00')))

Hope this is helpful.
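
For anyone who hits the same thing: a rough sketch of checking the link type before choosing a decoder, rather than always assuming Ethernet. The helper names (pcap_linktype, count_ip_packets) are made up for illustration and are not part of dpkt. The values 1 and 113 are the Ethernet and Linux cooked capture entries from the tcpdump.org link-type registry. The sketch assumes a classic pcap file (not pcapng) and that the upper bits of the header's network field are zero, which is the usual case.

```python
import struct

# Link-layer header types from the tcpdump.org LINKTYPE registry.
LINKTYPE_ETHERNET = 1
LINKTYPE_LINUX_SLL = 113


def pcap_linktype(global_header):
    """Return the network (link type) field of a classic pcap global header.

    Hypothetical helper, stdlib only. Assumes the upper bits of the
    field are unused, so the raw 32-bit value is returned unchanged.
    """
    if len(global_header) < 24:
        raise ValueError("a classic pcap global header is 24 bytes long")
    magic = global_header[:4]
    if magic in (b"\xd4\xc3\xb2\xa1", b"\x4d\x3c\xb2\xa1"):
        endian = "<"  # written on a little-endian host
    elif magic in (b"\xa1\xb2\xc3\xd4", b"\xa1\xb2\x3c\x4d"):
        endian = ">"  # written on a big-endian host
    else:
        raise ValueError("not a classic pcap file (pcapng or something else?)")
    (network,) = struct.unpack(endian + "I", global_header[20:24])
    return network


def count_ip_packets(path):
    """Count IP and non-IP packets, decoding each frame by link type."""
    # Imported here so pcap_linktype() above stays usable without dpkt.
    import dpkt
    from dpkt.sll import SLL

    ip_count = 0
    non_ip_count = 0
    with open(path, "rb") as f:
        reader = dpkt.pcap.Reader(f)
        linktype = reader.datalink()
        if linktype == LINKTYPE_LINUX_SLL:
            decode = SLL
        elif linktype == LINKTYPE_ETHERNET:
            decode = dpkt.ethernet.Ethernet
        else:
            raise ValueError("unhandled link type: %d" % linktype)
        for _timestamp, buf in reader:
            frame = decode(buf)
            if isinstance(frame.data, dpkt.ip.IP):
                ip_count += 1
            else:
                non_ip_count += 1
    return ip_count, non_ip_count
```

With the file from the report this would be called as count_ip_packets("data/unsw-nb15/01/2.pcap"). Raising on an unknown link type is a deliberate choice here: silently decoding everything as Ethernet is what produced the all-non-IP count in the first place.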

dhelmr commented 3 years ago

Sorry, my bad. Thank you!