kbandla / dpkt

fast, simple packet creation / parsing, with definitions for the basic TCP/IP protocols

Unexpected blank packet generated when dividing traffic flows into five-tuples #661

Open student-limo opened 11 months ago

student-limo commented 11 months ago

It is my first time creating an issue on Github so if I've done anything incorrectly I apologize in advance.

Describe the bug I have encountered an issue with the dpkt package when dividing traffic flows by five-tuple. When I checked the newly generated pcap file, I found a malformed blank packet at the end of the file. I don't know how this happened.

Code To Reproduce Here is the code I used to split the traffic into 5-tuple.

import dpkt
import socket
import os

def pcap_ana(pcap_path, save_path):
    """
    read pcap file and record flow
    in order to open once and write many times a flow.pcap file
    """
    with open(pcap_path, 'rb') as f:
        f.seek(0)
        capture = dpkt.pcap.Reader(f)
        flow_record = {}
        for ts, pkt in capture:
            eth = dpkt.ethernet.Ethernet(pkt)
            if isinstance(eth.data, dpkt.ip.IP): 
                ip = eth.data
                if isinstance(ip.data, dpkt.tcp.TCP):
                    tproto = "TCP"
                elif isinstance(ip.data, dpkt.udp.UDP):
                    tproto = "UDP"
                else:
                    continue
                trsm = ip.data
                sport = trsm.sport
                dport = trsm.dport
                flow = socket.inet_ntoa(ip.src) + '_' + str(sport) + '_' + socket.inet_ntoa(ip.dst) + '_' + str(
                    dport) + '_' + tproto
                flow_rvs = socket.inet_ntoa(ip.dst) + '_' + str(dport) + '_' + socket.inet_ntoa(ip.src) + '_' + str(
                    sport) + '_' + tproto
                if flow in flow_record.keys():
                    flow_record[flow].append([pkt, ts])
                elif flow_rvs in flow_record.keys():
                    flow_record[flow_rvs].append([pkt, ts])
                else:
                    flow_record[flow] = []
                    flow_record[flow].append([pkt, ts])
    flow_ana(flow_record, save_path)

def flow_ana(flow_record, save_path):
    """
    write pcap file according to flow_record dict
    """
    for key in flow_record:
        flow_path = save_path + key + '.pcap'
        file = open(flow_path, 'ab')
        writer = dpkt.pcap.Writer(file)
        for record in flow_record[key]:
            eth = record[0]
            tist = record[1]
            writer.writepkt(eth, ts=tist)
        file.flush()
        file.close()

if __name__ == "__main__":
    pcaps_dir_ori = "./****/" # Path of the folder to be processed
    for f in os.listdir(pcaps_dir_ori):
        folder_name = f.split('.')[0]
        pcaps_dir = "./****/" # Path of the output folder
        # Call the function defined above; flow_cut is not defined in this script
        pcap_ana(pcaps_dir_ori + f, pcaps_dir)

Screenshots Here is a screenshot of the output pcap file.


obormot commented 11 months ago

Is there a chance your input pcap file has truncated packets?
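One way to check an input file for truncated or empty records is to read the per-record headers directly. The following is a minimal sketch that uses only the standard library and assumes a classic pcap file (not pcapng); the function name `find_suspect_records` is hypothetical and is not part of dpkt.

```python
import struct

# Magic bytes as they appear on disk, mapped to the struct byte-order prefix.
PCAP_MAGICS = {
    b"\xd4\xc3\xb2\xa1": "<",  # little-endian, microsecond timestamps
    b"\xa1\xb2\xc3\xd4": ">",  # big-endian, microsecond timestamps
    b"\x4d\x3c\xb2\xa1": "<",  # little-endian, nanosecond timestamps
    b"\xa1\xb2\x3c\x4d": ">",  # big-endian, nanosecond timestamps
}


def find_suspect_records(f):
    """Return (frame_number, incl_len, orig_len) for empty or truncated records.

    f is a binary file object positioned at the start of a classic pcap file.
    """
    header = f.read(24)  # the pcap global header is 24 bytes
    if len(header) < 24 or header[:4] not in PCAP_MAGICS:
        raise ValueError("not a classic pcap file")
    endian = PCAP_MAGICS[header[:4]]

    suspects = []
    frame = 0
    while True:
        rec = f.read(16)  # per-record header: ts_sec, ts_frac, incl_len, orig_len
        if len(rec) < 16:
            break  # end of file (or a partial trailing header)
        frame += 1
        _ts_sec, _ts_frac, incl_len, orig_len = struct.unpack(endian + "IIII", rec)
        data = f.read(incl_len)
        # Flag records that are empty, cut by the snaplen, or cut off by end of file.
        if incl_len == 0 or incl_len < orig_len or len(data) < incl_len:
            suspects.append((frame, incl_len, orig_len))
    return suspects
```

Opening the input with `open(pcap_path, 'rb')` and passing the file object to this function lists the frame numbers worth inspecting in Wireshark.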

kbandla commented 11 months ago

As @obormot pointed out, frame 22 has zero length. You can check the packet length and abort if it's zero. However, the next packet might also be erroneous.
How did this happen? There could be many reasons, but check your snaplen value and try a larger value, or zero (65535). Your code looks alright; it should work with a healthy pcap. Hope this helps.
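The length check suggested above could look roughly like this. It is a sketch, not dpkt functionality: `nonempty_packets` and its `stop_on_empty` parameter are hypothetical names, and the helper only assumes it is given an iterable of `(ts, buf)` pairs such as the one the issue's code gets from `dpkt.pcap.Reader`.

```python
def nonempty_packets(capture, stop_on_empty=False):
    """Yield (ts, buf) pairs from capture, dropping zero-length packets.

    With stop_on_empty=True, iteration stops at the first empty packet,
    since the records after it may be unreliable as well.
    """
    for ts, buf in capture:
        if len(buf) == 0:
            if stop_on_empty:
                break
            continue
        yield ts, buf
```

In the script from the issue, the loop `for ts, pkt in capture:` would become `for ts, pkt in nonempty_packets(capture):`, so empty frames never reach `flow_record` or the output files.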