LibtraceTeam / libprotoident

Network traffic classification library that requires minimal application payload
GNU Lesser General Public License v3.0

Does this take care of VLAN headers as well? #14

Open mazkopolo opened 8 years ago

mazkopolo commented 8 years ago

Hi, I am not sure whether I should log this as an issue or not. Here is the problem: when I use Bro IDS to parse my pcap file and compare the results with lpi_protoident, I see totally different results. Does the current version of libprotoident take VLAN headers into consideration?

salcock commented 8 years ago

Yes, libprotoident is able to recognise and skip over VLAN headers.
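To illustrate what "skipping over" a VLAN header means, here is a minimal Python sketch. It is not libprotoident's actual code; the function name `skip_vlan_headers` is hypothetical, and the sketch only assumes the standard Ethernet and 802.1Q/802.1ad layouts. It reads the EtherType at byte 12 and, while that value marks a VLAN tag, steps 4 bytes further to find the real payload protocol.

```python
import struct

ETH_HEADER_LEN = 14                  # dst MAC (6) + src MAC (6) + EtherType (2)
VLAN_TAG_LEN = 4                     # tag control info (2) + next EtherType (2)
VLAN_ETHERTYPES = {0x8100, 0x88A8}   # 802.1Q and 802.1ad (QinQ)


def skip_vlan_headers(frame: bytes):
    """Return (ethertype, payload_offset) after stepping past any VLAN tags.

    Hypothetical helper for illustration only; not part of libprotoident.
    """
    if len(frame) < ETH_HEADER_LEN:
        raise ValueError("frame too short for an Ethernet header")
    (ethertype,) = struct.unpack_from("!H", frame, 12)
    offset = ETH_HEADER_LEN
    # Each VLAN tag pushes the real EtherType another 4 bytes into the frame.
    while ethertype in VLAN_ETHERTYPES:
        if len(frame) < offset + VLAN_TAG_LEN:
            raise ValueError("truncated VLAN tag")
        (ethertype,) = struct.unpack_from("!H", frame, offset + 2)
        offset += VLAN_TAG_LEN
    return ethertype, offset


# Example: an IPv4 frame carrying a single 802.1Q tag (VLAN ID 100).
macs = b"\x00" * 12
tagged = macs + b"\x81\x00" + b"\x00\x64" + b"\x08\x00" + b"payload"
ethertype, offset = skip_vlan_headers(tagged)
print(hex(ethertype), offset)
```

With this approach the classifier sees the same IP packet whether or not the frame was tagged, so VLAN tags on their own should not change flow or byte counts.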

Could you please provide some more detail about the problem you are experiencing? For starters, could you post some of the lines of lpi_protoident output and bro output that you think are disagreeing?

Also, what OS are you running libprotoident on (e.g. Debian, Fedora, CentOS, Mac OS X, etc.)?

mazkopolo commented 8 years ago

What do you mean by "skip over"? Please see below.

When I use Bro IDS, I get the following results for a set of pcap files:
Megabytes received: 166,528
Megabytes sent: 447,507

When I use lpi_protoident, I get the following results:
Megabytes received: 70,0694
Megabytes sent: 274,312

Not only that, I am also seeing a different number of unique connections from each application:
Bro IDS: 848,623 unique outgoing connections
lpi_protoident: 219,957 unique outgoing connections

mazkopolo commented 8 years ago

OK, here is a simple test I ran on a very small pcap file (3,604,924 bytes).

Number of connections:
Bro IDS: 92 flows
lpi_protoident: 83 flows
Wireshark: 91 TCP and UDP conversations

Volumetric comparison:
Bro IDS: bytes in 3,060,568, bytes out 374,668, total (in + out) 3,435,236
lpi_protoident: bytes in 2,882,012, bytes out 239,763, total (in + out) 3,121,775

This clearly shows that lpi_protoident is skipping something?!
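To put the size of the gap in the small test into numbers, the figures above can be compared with a few lines of Python. The helper name `shortfall` is hypothetical; the input values are the flow and byte totals quoted in this comment.

```python
def shortfall(reference, observed):
    """Return (absolute gap, gap as a percentage of the reference value)."""
    gap = reference - observed
    return gap, 100.0 * gap / reference


# Totals reported above for the small pcap file.
bro_flows, lpi_flows = 92, 83
bro_bytes, lpi_bytes = 3_435_236, 3_121_775

flow_gap, flow_pct = shortfall(bro_flows, lpi_flows)
byte_gap, byte_pct = shortfall(bro_bytes, lpi_bytes)

print(f"flows: {flow_gap} fewer ({flow_pct:.1f}%)")
print(f"bytes: {byte_gap:,} fewer ({byte_pct:.1f}%)")
```

Both counts fall short of the Bro figures by a little under 10%, which suggests that whole flows are being left out rather than individual packets being miscounted.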

salcock commented 8 years ago

I can't speak for Bro, but libprotoident ignores TCP flows that did not start within the time period covered by the pcap file (i.e. flows where the TCP handshake occurred before the capture started). This is because libprotoident relies on seeing the first payload-bearing packets of the flow: if we don't see the handshake, then we can't guarantee we're looking at the right packets.

This is probably the reason why you are seeing fewer flows and bytes than other tools.
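The effect of that rule on flow and byte totals can be sketched in Python. This is only an illustration under simplified assumptions, not libprotoident's implementation: the `Packet` record and the `tally` function are hypothetical, and a flow counts as "started within the capture" as soon as a SYN for it has been seen.

```python
from collections import namedtuple

# Hypothetical, simplified packet record: flow key, SYN flag, payload bytes.
Packet = namedtuple("Packet", "flow_id syn payload_len")


def tally(packets, require_handshake):
    """Return (flow count, total payload bytes) for a packet sequence.

    With require_handshake=True, packets of a flow are only counted once a
    SYN for that flow has been seen, so flows that began before the capture
    started contribute nothing.
    """
    seen_syn = set()
    totals = {}
    for p in packets:
        if p.syn:
            seen_syn.add(p.flow_id)
        if require_handshake and p.flow_id not in seen_syn:
            continue  # mid-stream flow: handshake was never observed
        totals[p.flow_id] = totals.get(p.flow_id, 0) + p.payload_len
    return len(totals), sum(totals.values())


capture = [
    Packet("A", True, 0),     # flow A starts inside the capture
    Packet("A", False, 100),
    Packet("B", False, 500),  # flow B was already running when capture began
    Packet("B", False, 300),
]

print(tally(capture, require_handshake=False))
print(tally(capture, require_handshake=True))
```

A tool that counts every packet reports both flows here, while one that insists on seeing the handshake reports only flow A, so the two can disagree on both flow count and byte volume for the same pcap file.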

mazkopolo commented 8 years ago

I ran another experiment to verify your assumption. I used a port mirroring box to capture all Internet traffic (incoming and outgoing) from a machine. I started capturing before I switched on the machine, to make sure I was not missing any TCP handshake packets. After browsing a few websites and checking email, I switched off the machine and then stopped the capture. The result was a pcap file that I ran through both Bro IDS and lpi_protoident. Here are the results:

Application: Byte In, Byte Out, SUM
Bro: 50,103,399, 1,660,354, 51,763,753
LPI: 39,083,769, 9,864,946, 48,948,715

I realized lpi_protoident could not process the table below: diff