noise-lab / netml

Feature Extraction and Machine Learning from Network Traffic Traces
Apache License 2.0

pcap2pandas(): KeyError for 'datetime' #31

Open Rameen-Mahmood opened 2 weeks ago

Rameen-Mahmood commented 2 weeks ago

I'm encountering an issue with the pcap2pandas() method in the netml.pparser.parser module. When I use a pcap file other than the provided demo.pcap, a KeyError: 'datetime' is raised.

Code:

from netml.pparser.parser import PCAP

pcap = PCAP('path/to/your.pcap')
pcap.pcap2pandas()

pdf = pcap.df

Error:

Exception has occurred: KeyError
'datetime'
KeyError: 'datetime'

The above exception was the direct cause of the following exception:

File "path/to/your_script.py", line X, in <module>
    pcap.pcap2pandas()
KeyError: 'datetime'

Environment:

jesteria commented 2 weeks ago

Hello. Thanks for the report. I can see how that might occur. Nonetheless, would it be possible for you to provide your input PCAP file?

Rameen-Mahmood commented 2 weeks ago

https://drive.google.com/file/d/116a6glE2LR6HmCMfuNPhaZvX5RTkCDSP/view?usp=sharing

jesteria commented 2 weeks ago

OK. None of the packets in that PCAP appear to contain any Ethernet device information, but this method was written to consider only packets which do contain Ethernet device information.

Certainly, the method can be adjusted to handle this edge case, at least to provide a more useful error message.
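
To illustrate the kind of guard that could produce a clearer error, here is a minimal sketch in plain Python. It does not use netml's actual internals: the packet dictionaries, their field names, and the helpers has_ethernet() and rows_from_packets() are all hypothetical stand-ins.

```python
def has_ethernet(packet):
    """Return True if the (hypothetical) packet dict carries MAC addresses."""
    return "src_mac" in packet and "dst_mac" in packet


def rows_from_packets(packets):
    """Keep only packets with Ethernet info; fail loudly if none remain."""
    rows = [p for p in packets if has_ethernet(p)]
    if not rows:
        # Without this check, an empty result would only surface later,
        # for example as a KeyError when a column such as 'datetime' is read.
        raise ValueError(
            "no packets with Ethernet (MAC) information found; "
            "cannot build a DataFrame with a 'datetime' column"
        )
    return rows


# A capture with no link-layer fields at all (illustrative values).
packets_without_mac = [
    {"datetime": 1700000000.0, "src_ip": "10.0.0.2", "dst_ip": "1.1.1.1"},
]

try:
    rows_from_packets(packets_without_mac)
except ValueError as err:
    message = str(err)
```

With a check like this, the caller sees a message naming the real cause instead of a bare KeyError.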

Beyond that, do you expect this method to generate a DataFrame for this stream? I don't personally see why not; it would only be missing MAC address source and destination information. Nonetheless, I'd be curious to hear the thoughts of the author, @feamster (judging by the record).

crazyideas21 commented 2 weeks ago

Turns out we don't have the MAC layer because we captured traffic using a WireGuard VPN (that's our method of getting mobile traffic).

@Rameen-Mahmood will try to disable the MAC layer check and add dummy fields for the MAC layer. In our case, the MAC layer is boring because it only shows either the gateway or the phone.
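
As a rough sketch of that workaround, assuming packets are represented as plain dictionaries (this is not netml's code; DUMMY_MAC, with_dummy_mac() and the field names are hypothetical):

```python
DUMMY_MAC = "00:00:00:00:00:00"  # placeholder value, chosen for illustration


def with_dummy_mac(packet):
    """Return a copy of the packet dict with MAC fields filled in if absent."""
    filled = dict(packet)  # copy, so the original packet is left untouched
    filled.setdefault("src_mac", DUMMY_MAC)
    filled.setdefault("dst_mac", DUMMY_MAC)
    return filled


# Packets shaped like a VPN capture: IP-level fields only (illustrative values).
vpn_packets = [
    {"datetime": 1700000000.0, "src_ip": "10.0.0.2", "dst_ip": "1.1.1.1"},
]

rows = [with_dummy_mac(p) for p in vpn_packets]
```

Every row then has the same set of keys, so downstream code that expects MAC columns can run unchanged.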

Thank you @jesteria for the pointer to the code!

jesteria commented 1 week ago

If it helps, you might try the fix proposed by #32 – that is, in branch jsl/fix-dataframe-no-mac.