simsong / bulk_extractor

This is the development tree. Production downloads are at:
https://github.com/simsong/bulk_extractor/releases

IPv6 packets not extracted #351

Open erik4711 opened 2 years ago

erik4711 commented 2 years ago

See the screenshot below with IPv6 packets that have been carved from Ali Hadi's memdump.mem with CapLoader. [Screenshot: memdump mem BC7B2608] Unfortunately, BE fails to carve any IPv6 packets from that same memdump. I'm using bulk_extractor 2.0.1 on Windows.

simsong commented 2 years ago

Thank you for posting this. IPv6 carving is working in some cases but apparently not this one.

Your memory dump also demonstrates some other errors. On my Mac, the program hangs after processing the input. I'm not sure why.

Here is the output for memdump.mem with bulk_extractor 2.0 running on macOS:

(base) simsong@Seasons out-mem % ls -l *pcap
-rw-r--r--  1 simsong  staff  176599 Mar 17 07:19 packets.pcap
(base) simsong@Seasons out-mem % tcpdump -r packets.pcap
...
-4:00:00.000000 IP 192.168.56.101.62184 > 239.255.255.250.upnp-discovery: UDP, length 626
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.100.bootps > broadcasthost.bootpc: BOOTP/DHCP, Reply, length 548
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.1.netbios-dgm > 192.168.56.255.netbios-dgm: UDP, length 209
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.1.netbios-dgm > 192.168.56.255.netbios-dgm: UDP, length 201
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.1.netbios-dgm > 192.168.56.255.netbios-dgm: UDP, length 209
-4:00:00.000000 IP 192.168.56.1.63832 > 224.0.0.252.llmnr: UDP, length 22
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.100.bootps > broadcasthost.bootpc: BOOTP/DHCP, Reply, length 548
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.1.netbios-dgm > 192.168.56.255.netbios-dgm: UDP, length 201
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.100.bootps > broadcasthost.bootpc: BOOTP/DHCP, Reply, length 548
-4:00:00.000000 IP 192.168.56.100.bootps > broadcasthost.bootpc: BOOTP/DHCP, Reply, length 548
-4:00:00.000000 IP 192.168.56.100.bootps > broadcasthost.bootpc: BOOTP/DHCP, Reply, length 548
-4:00:00.000000 IP 192.168.56.100.bootps > broadcasthost.bootpc: BOOTP/DHCP, Reply, length 548
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
-4:00:00.000000 IP 192.168.56.1.netbios-ns > 192.168.56.255.netbios-ns: UDP, length 50
...

But your IPv6 packets do not show up. Thanks.

erik4711 commented 2 years ago

CapLoader extracts 612 packets from that memdump (565 IPv4, 47 IPv6). BE extracts 564 packets (all IPv4).

Below is the output from tshark's Protocol Hierarchy Statistics for the packets carved with CapLoader and BE.

tshark -r memdump.mem.pcapng -z io,phs -q (packets from CapLoader)

ipv6                                     frames:10 bytes:8512
  tcp                                    frames:10 bytes:8512
    mysql                                frames:2 bytes:1682
      _ws.malformed                      frames:2 bytes:1682
        mysql                            frames:2 bytes:1682
          _ws.malformed                  frames:2 bytes:1682
            mysql                        frames:1 bytes:1500
              mysql                      frames:1 bytes:1500
                _ws.malformed            frames:1 bytes:1500
eth                                      frames:600 bytes:170429
  ip                                     frames:563 bytes:166538
    udp                                  frames:397 bytes:69236
      data                               frames:5 bytes:5748
      nbns                               frames:277 bytes:25598
      dhcp                               frames:52 bytes:28944
      nbdgm                              frames:26 bytes:6388
        smb                              frames:26 bytes:6388
          mailslot                       frames:26 bytes:6388
            browser                      frames:26 bytes:6388
      llmnr                              frames:37 bytes:2558
    tcp                                  frames:166 bytes:97302
      nbss                               frames:32 bytes:5438
        smb2                             frames:12 bytes:2212
        smb                              frames:16 bytes:2994
      http                               frames:6 bytes:4530
        xml                              frames:2 bytes:2540
          tcp.segments                   frames:2 bytes:2540
        mime_multipart                   frames:1 bytes:603
          tcp.segments                   frames:1 bytes:603
      data                               frames:70 bytes:76506
  ipv6                                   frames:37 bytes:3891
    udp                                  frames:37 bytes:3891
      data                               frames:1 bytes:688
      llmnr                              frames:36 bytes:3203
raw                                      frames:2 bytes:156
  ip                                     frames:2 bytes:156
    udp                                  frames:2 bytes:156
      llmnr                              frames:2 bytes:156

tshark -r packets.pcap -z io,phs -q (packets from BE 2.0.1)

eth                                      frames:564 bytes:167551
  data                                   frames:1 bytes:1013
  ip                                     frames:563 bytes:166538
    udp                                  frames:397 bytes:69236
      data                               frames:5 bytes:5748
      nbns                               frames:277 bytes:25598
      dhcp                               frames:52 bytes:28944
      nbdgm                              frames:26 bytes:6388
        smb                              frames:26 bytes:6388
          mailslot                       frames:26 bytes:6388
            browser                      frames:26 bytes:6388
      llmnr                              frames:37 bytes:2558
    tcp                                  frames:166 bytes:97302
      nbss                               frames:32 bytes:5438
        smb2                             frames:12 bytes:2212
        smb                              frames:16 bytes:2994
      http                               frames:6 bytes:4530
        xml                              frames:2 bytes:2540
          tcp.segments                   frames:2 bytes:2540
        mime_multipart                   frames:1 bytes:603
          tcp.segments                   frames:1 bytes:603
      data                               frames:70 bytes:76506
simsong commented 2 years ago

This is really helpful. Thanks.

simsong commented 2 years ago

@erik4711 - can you get me a copy of the pcap file? It seems that the IPv6 LLMNR may not be validating in my IPv6 validator. Thanks.

simsong commented 2 years ago

At offset 1063526424 is the packet that includes the string xzignbnfvi. Here is how the packet decodes.

Packet dump:

(base) simsong@Seasons ~ % xxd -g 1 memdump-1063526424.pkt | cut -c 10-59
 60 00 00 00 00 24 11 01 fe 80 00 00 00 00 00 00
 c4 6b 00 e4 0d 55 62 b1 ff 02 00 00 00 00 00 00
 00 00 00 00 00 01 00 03 f9 9e 14 eb 00 24 21 dc
 7e 2a 00 00 00 01 00 00 00 00 00 00 0a 78 7a 69
 67 6e 62 6e 66 76 69 00 00 01 00 01 

And here is the packet: https://github.com/simsong/bulk_extractor/blob/edffe27fc0822351c7e8deb76c28e5ee6642f5b1/src/tests/ipv6_packet.pkt

Here is the packet with a bogus Ethernet header:

A0 FF 70 C0 F2 5B 44 A9 2C 51 41 90 86 DD 60 00 
00 00 00 24 11 01 FE 80 00 00 00 00 00 00 C4 6B 
00 E4 0D 55 62 B1 FF 02 00 00 00 00 00 00 00 00 
00 00 00 01 00 03 F9 9E 14 EB 00 24 21 DC 7E 2A 
00 00 00 01 00 00 00 00 00 00 0A 78 7A 69 67 6E 
62 6E 66 76 69 00 00 01 00 01
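For readers who want to reproduce the framing step, here is a minimal Python sketch (not bulk_extractor code) that prepends the same placeholder Ethernet header to the carved IPv6 bytes and wraps the frame in a classic pcap file so that tshark or Wireshark can decode it. The helper names `add_ethernet_header` and `to_pcap` are hypothetical; the MAC addresses are simply the bogus ones shown above.

```python
import struct

# IPv6 + UDP bytes of the packet carved at offset 1063526424 (from the dump above).
IPV6_PACKET = bytes.fromhex(
    "60 00 00 00 00 24 11 01 fe 80 00 00 00 00 00 00 "
    "c4 6b 00 e4 0d 55 62 b1 ff 02 00 00 00 00 00 00 "
    "00 00 00 00 00 01 00 03 f9 9e 14 eb 00 24 21 dc "
    "7e 2a 00 00 00 01 00 00 00 00 00 00 0a 78 7a 69 "
    "67 6e 62 6e 66 76 69 00 00 01 00 01"
)

ETHERTYPE_IPV6 = 0x86DD      # EtherType for IPv6
LINKTYPE_ETHERNET = 1        # pcap link type for Ethernet


def add_ethernet_header(ip_packet, dst_mac, src_mac, ethertype=ETHERTYPE_IPV6):
    """Prepend a 14-byte Ethernet II header (hypothetical helper)."""
    return dst_mac + src_mac + struct.pack("!H", ethertype) + ip_packet


def to_pcap(frame, snaplen=65535):
    """Wrap one frame in a classic little-endian pcap file (hypothetical helper)."""
    # Global header: magic, version 2.4, thiszone, sigfigs, snaplen, link type.
    global_header = struct.pack("<IHHiIII", 0xA1B2C3D4, 2, 4, 0, 0,
                                snaplen, LINKTYPE_ETHERNET)
    # Record header: ts_sec, ts_usec, captured length, original length.
    record_header = struct.pack("<IIII", 0, 0, len(frame), len(frame))
    return global_header + record_header + frame


frame = add_ethernet_header(IPV6_PACKET,
                            bytes.fromhex("a0ff70c0f25b"),
                            bytes.fromhex("44a92c514190"))
pcap_bytes = to_pcap(frame)
```

Writing `pcap_bytes` to a file (for example with `open("one_packet.pcap", "wb")`, a file name chosen here only for illustration) should give something that `tshark -r` can read.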

Here is how the packet decodes:

simsong commented 2 years ago

Apparently, the IPv6 checksum (0xdc21) is not being properly computed.

erik4711 commented 2 years ago

@simsong Please see the attached zip file with packets carved from the memdump with CapLoader. packets-extracted-with-CapLoader.zip

For reference, these packets were carved with the default carving settings in CapLoader, as shown in this screenshot: CapLoader-carve-menu

The UDP checksum in the packet you referenced is 0x21dc (big endian), which is correct. Please note that UDP checksums are calculated differently depending on whether the datagram is encapsulated in an IPv4 or an IPv6 packet.
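The claim can be checked by hand against the packet posted earlier in this thread. The following is a minimal Python sketch, not CapLoader's or bulk_extractor's code; it assumes an IPv6 header with no extension headers, and the function names `internet_checksum` and `udp6_checksum` are made up for this illustration.

```python
import struct

# The carved IPv6/UDP packet from offset 1063526424, as posted above.
PACKET = bytes.fromhex(
    "60 00 00 00 00 24 11 01 fe 80 00 00 00 00 00 00 "
    "c4 6b 00 e4 0d 55 62 b1 ff 02 00 00 00 00 00 00 "
    "00 00 00 00 00 01 00 03 f9 9e 14 eb 00 24 21 dc "
    "7e 2a 00 00 00 01 00 00 00 00 00 00 0a 78 7a 69 "
    "67 6e 62 6e 66 76 69 00 00 01 00 01"
)


def internet_checksum(data):
    """16-bit one's complement of the one's complement sum of 16-bit words."""
    if len(data) % 2:
        data += b"\x00"                      # zero pad byte for odd lengths
    total = sum(struct.unpack("!%dH" % (len(data) // 2), data))
    while total >> 16:                       # fold carries back into 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


def udp6_checksum(ipv6_packet):
    """UDP checksum over the IPv6 pseudo-header (assumes no extension headers)."""
    src, dst = ipv6_packet[8:24], ipv6_packet[24:40]
    udp = bytearray(ipv6_packet[40:])
    udp[6:8] = b"\x00\x00"                   # checksum field is zero while computing
    # Pseudo-header: src, dst, 32-bit upper-layer length, 3 zero bytes, next header 17.
    pseudo = src + dst + struct.pack("!I3xB", len(udp), 17)
    # Note: a sender transmits a computed value of 0 as 0xFFFF for UDP.
    return internet_checksum(pseudo + bytes(udp))


stored_big_endian = struct.unpack("!H", PACKET[46:48])[0]      # 0x21dc
stored_little_endian = struct.unpack("<H", PACKET[46:48])[0]   # 0xdc21
computed = udp6_checksum(PACKET)                               # 0x21dc
```

The computed value matches the stored field when it is read in network byte order; reading the same two bytes as little endian gives the 0xdc21 mentioned earlier.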

simsong commented 2 years ago

Thank you. I have determined that the bulk_extractor IPv6 UDP checksum computation is incorrect, which is why the IPv6 carving was not working properly. Do you have a validated checksum computation for TCP, UDP, ICMPv6, and other checksummed protocols running over IPv6?

erik4711 commented 2 years ago

The checksum validator I've written for the packet carvers in CapLoader and NetworkMiner Professional is proprietary. If you don't want to write your own, then maybe you can re-use the checksum implementation from Wireshark?

simsong commented 2 years ago

I've looked and can't easily find it. Do you know where it is? Would you be willing to look over my implementation and provide me with test cases? I'm mostly there. It won't be as efficient as some crazy implementations out there, but it will be validated.

erik4711 commented 2 years ago

Sorry, I don't know where the checksum code is in Wireshark. But @guyharris probably does. I can run some memdumps through CapLoader's packet carver and provide you with the resulting pcapng files, so that we can compare the output from our tools. Do you have any particular memory dumps (or other types of files) that you'd like me to carve packets from?

guyharris commented 2 years ago

For the Internet checksum, used in the protocols you mention, the Wireshark code is in epan/in_cksum.c; it's based on the BSD checksum code.

simsong commented 2 years ago

Thanks. As I feared, that code is not usable for me. I can't figure out what it is doing. And, to be fair, that appears to be ipv4 checksum, not ipv6.

guyharris commented 2 years ago

> As I feared, that code is not usable for me.

If "that code" refers to the code from Wireshark, then:

> I can't figure out what it is doing.

RFC 791:

The checksum algorithm is:

  The checksum field is the 16 bit one's complement of the one's
  complement sum of all 16 bit words in the header.  For purposes of
  computing the checksum, the value of the checksum field is zero.

RFC 792:

Header Checksum

  The 16 bit one's complement of the one's complement sum of all 16
  bit words in the header.  For computing the checksum, the checksum
  field should be zero.  This checksum may be replaced in the
  future.

RFC 793:

Checksum: 16 bits

The checksum field is the 16 bit one's complement of the one's
complement sum of all 16 bit words in the header and text.  If a
segment contains an odd number of header and text octets to be
checksummed, the last octet is padded on the right with zeros to
form a 16 bit word for checksum purposes.  The pad is not
transmitted as part of the segment.  While computing the checksum,
the checksum field itself is replaced with zeros.

The checksum also covers a 96 bit pseudo header conceptually
prefixed to the TCP header.  This pseudo header contains the Source
Address, the Destination Address, the Protocol, and TCP length.
This gives the TCP protection against misrouted segments.  This
information is carried in the Internet Protocol and is transferred
across the TCP/Network interface in the arguments or results of
calls by the TCP on the IP.

RFC 768:

Checksum is the 16-bit one's complement of the one's complement sum of a
pseudo header of information from the IP header, the UDP header, and the
data,  padded  with zero octets  at the end (if  necessary)  to  make  a
multiple of two octets.

What in_cksum() is doing is taking an array of pointer/length pairs, treating the concatenation of the blobs of bytes to which they refer as a single blob of bytes (without padding between them; any blob may contain an odd number of bytes), and computing the 16 bit one's complement of the one's complement sum of all 16 bit words in the blob; if the total number of bytes in the single blob is odd, the last 16 bit word is constructed as if that final byte were followed by a zero pad byte.

I.e., it performs the calculation used by the IPv4 header checksum, the TCP checksum, the ICMP checksum, and the UDP checksum. The array allows its callers to construct the pseudo-header in a local array and create a scatter/gather list containing the pseudo-header and the payload to be checksummed.
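To make the description concrete, here is a minimal Python sketch of the behaviour described above. It is not the Wireshark or BSD code and makes no attempt at their efficiency; it only mirrors the described semantics: a list of byte blobs is checksummed as if concatenated, a 16-bit word may straddle two blobs, and a trailing odd byte is padded with zero.

```python
def in_cksum(blobs):
    """Internet checksum over a scatter/gather list of bytes objects (sketch)."""
    total = 0
    pending = None                 # high byte of a word still waiting for its low byte
    for blob in blobs:
        for byte in blob:
            if pending is None:
                pending = byte
            else:
                total += (pending << 8) | byte
                pending = None
    if pending is not None:
        total += pending << 8      # odd total length: pad the last word with zero
    while total >> 16:             # end-around carry (one's complement addition)
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# Splitting the data differently must not change the result.
whole = in_cksum([b"\x01\x02\x03"])
split = in_cksum([b"\x01", b"\x02\x03"])
```

A caller would pass the pseudo-header as the first blob and the transport header plus payload as the second, which avoids copying the payload into a temporary buffer.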

> And, to be fair, that appears to be ipv4 checksum, not ipv6.

The IPv6 header has no checksum, so an IPv6 checksum would be a checksum for a protocol running atop IPv6.

RFC 8200 section 8.1 "Upper-layer Checksums" doesn't indicate that the "16-bit one's complement of the one's complement sum of 16-bit words, with a zero byte at the end to pad the blob of bytes to a multiple of 2 bytes" part of the checksum changes; it's just the pseudo-header that's different.
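The difference between the two pseudo-headers can be sketched as follows. This is an illustrative Python fragment with hypothetical function names; the layouts follow the 96-bit pseudo-header quoted from RFC 793 above and the 40-byte layout in RFC 8200 section 8.1. As far as I know, ICMPv6 also includes the IPv6 pseudo-header in its checksum, unlike ICMP over IPv4.

```python
import ipaddress
import struct


def pseudo_header_v4(src, dst, protocol, length):
    """12-byte IPv4 pseudo-header: src, dst, zero byte, protocol, 16-bit length."""
    return (ipaddress.IPv4Address(src).packed
            + ipaddress.IPv4Address(dst).packed
            + struct.pack("!BBH", 0, protocol, length))


def pseudo_header_v6(src, dst, next_header, length):
    """40-byte IPv6 pseudo-header: src, dst, 32-bit length, 3 zero bytes, next header."""
    return (ipaddress.IPv6Address(src).packed
            + ipaddress.IPv6Address(dst).packed
            + struct.pack("!I3xB", length, next_header))


UDP = 17
# Addresses taken from packets shown in this thread; the lengths include the UDP header.
v4 = pseudo_header_v4("192.168.56.1", "192.168.56.255", UDP, 58)
v6 = pseudo_header_v6("fe80::c46b:e4:d55:62b1", "ff02::1:3", UDP, 36)
```

Everything after the pseudo-header (the transport header and payload, summed as 16-bit words with zero padding) is handled identically in both cases.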

simsong commented 2 years ago

Thank you. I know the algorithms. It is wireshark code that I cannot figure out.

I have code for UDP over IPv6 working, but my packet has an even number of bytes. I will check your upload to see if some packets have an odd number, which takes a different code path at the end.

I should have this working in a few days. I thank you for your help!

simsong commented 2 years ago

Thanks. I have a basic IPv6 UDP checksum operational. It works for a test packet that I found on StackOverflow but it doesn't work for the packet that I carved out of your memory. Here it is:

https://github.com/simsong/bulk_extractor/blob/f339ca85bbc0a94aeff4121e0a6c040f53b39698/src/scan_net.cpp#L206-L234

Any suggestions?

simsong commented 2 years ago

@erik4711 - Some of the packets from CapLoader have bad checksums:

slg@lastdance ~ % tcpdump -r memdump.mem.pcapng -vvn
reading from PCAP-NG file memdump.mem.pcapng
-3:-59:-34.770452 IP6 (hlim 128, next-header TCP (6) payload length: 32) ::1.49228 > ::1.80: Flags [S], cksum 0x5e0c (correct), seq 838534241, win 8192, options [mss 65475,nop,wscale 2,nop,nop,sackOK], length 0
-3:-59:-33.071384 IP6 (hlim 128, next-header TCP (6) payload length: 142) ::1.51097 > ::1.3306: Flags [P.], cksum 0x9f79 (incorrect -> 0xe31b), seq 632006740:632006862, ack 2354455886, win 31, length 122
-3:-53:-38.477232 IP6 (hlim 128, next-header TCP (6) payload length: 1158) ::1.80 > ::1.51140: Flags [P.], cksum 0xb9ac (incorrect -> 0xbb72), seq 2203675899:2203677037, ack 1808921540, win 29, length 1138: HTTP
-3:-42:-25.322320 IP6 (hlim 128, next-header TCP (6) payload length: 1460) ::1.3306 > ::1.51077: Flags [.], cksum 0x0a2a (incorrect -> 0x6928), seq 701310533:701311973, ack 611766742, win 29, length 1440
-3:-42:-21.097800 IP6 (hlim 128, next-header TCP (6) payload length: 160) ::1.80 > ::1.51123: Flags [P.], cksum 0xf688 (incorrect -> 0xef5a), seq 2678251386:2678251526, ack 343111608, win 27, length 140: HTTP
-3:-42:-21.356234 IP (tos 0x0, ttl 1, id 14757, offset 0, flags [none], proto UDP (17), length 654)
    192.168.56.101.62184 > 239.255.255.250.3702: [bad udp cksum 0xeb93 -> 0xfd39!] UDP, length 626
-3:-42:-21.431158 IP6 (hlim 1, next-header UDP (17) payload length: 634) fe80::3816:d72e:759b:76b9.62185 > ff02::c.3702: [bad udp cksum 0x4e3f -> 0x86b6!] UDP, length 626
tcpdump: packet printing is not supported for link type 101: use -w
-3:-42:-21.487904 %

Considering that my problem right now is getting checksums correct, do you think that I should just give up on them, and use other approaches for validating the packets?

simsong commented 2 years ago

@erik4711 - another issue with the memdump.mem.pcapng file you sent: it appears to be much larger than can be explained by the packets it contains:

slg@lastdance ~ % ls -l  memdump.mem.pcapng
-rw-r--r--@ 1 slg  staff  199284 Apr  4 14:59 memdump.mem.pcapng
slg@lastdance ~ % bc
bc 1.06
Copyright 1991-1994, 1997, 1998, 2000 Free Software Foundation, Inc.
This is free software with ABSOLUTELY NO WARRANTY.
For details type `warranty'.
32+142+1158+1460+160+654+626+626
4858
^D%
slg@lastdance ~ %

Any idea why the file is 199,284 bytes long but only appears to have 4858 bytes of data?

guyharris commented 2 years ago

> It is wireshark code that I cannot figure out.

Ignore it. The only part of the Wireshark checksumming code of interest to anybody not doing Wireshark development is the part I yanked out of the BSD kernel and tweaked to use the scatter/gather list array rather than an mbuf chain, i.e. in_cksum().

erik4711 commented 2 years ago

@simsong The fact that you only see the first 8 packets in the pcapng file appears to be because tcpdump exits before having processed all the packets. I noticed this error message in your output: tcpdump: packet printing is not supported for link type 101: use -w

What are the odds that the tcpdump guru @guyharris himself is also active in this thread?!?

Guy: Do you know if this is due to a bug in tcpdump or if this is expected behaviour? The capture file we're referring to is memdump.mem.pcapng.

Here's the output from capinfos for that pcapng file, just to verify that it contains 612 packets rather than 8:

File name:           memdump.mem.pcapng
File type:           Wireshark/... - pcapng
File encapsulation:  Per packet
Encapsulation in use by packets (# of pkts):
                     Ethernet (600)
                     Raw IP (2)
                     Raw IPv6 (10)
File timestamp precision:  microseconds (6)
Packet size limit:   file hdr: (not set)
Number of packets:   612
File size:           199 kB
Data size:           179 kB
Capture duration:    1042,772972 seconds
First packet time:   1970-01-01 01:00:26,770452
Last packet time:    1970-01-01 01:17:49,543424
Data byte rate:      171 bytes/s
Data bit rate:       1 374 bits/s
Average packet size: 292,64 bytes
Average packet rate: 0 packets/s
SHA256:              4820b1b5cde71195d57207149f5c1330a7202c2dc363ccc5e530298e582f1337
RIPEMD160:           eb90e0b425c83eb28d81c111d668af6fc0687804
SHA1:                00534acea9375c0659e33550a5324b6b5a672c6a
Strict time order:   True
Number of interfaces in file: 3
Interface #0 info:
                     Encapsulation = Ethernet (1 - ether)
                     Capture length = 123456
                     Time precision = microseconds (6)
                     Time ticks per second = 1000000
                     Number of stat entries = 0
                     Number of packets = 600
Interface #1 info:
                     Encapsulation = Raw IP (7 - rawip)
                     Capture length = 123456
                     Time precision = microseconds (6)
                     Time ticks per second = 1000000
                     Number of stat entries = 0
                     Number of packets = 2
Interface #2 info:
                     Encapsulation = Raw IPv6 (130 - rawip6)
                     Capture length = 123456
                     Time precision = microseconds (6)
                     Time ticks per second = 1000000
                     Number of stat entries = 0
                     Number of packets = 10
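
The per-interface counts above can also be obtained without Wireshark tools. Below is a rough Python sketch of a pcapng walker that records the link type of each Interface Description Block and counts Enhanced Packet Blocks per interface. It assumes a single section and ignores Simple Packet Blocks and all options; `summarize_pcapng` is a hypothetical name. Note that the numbers it reports are pcapng link types (for example 101, which tcpdump prints for raw IP), not the encapsulation numbers that capinfos prints.

```python
import struct
from collections import Counter

SHB = 0x0A0D0D0A   # Section Header Block (reads the same in either byte order)
IDB = 0x00000001   # Interface Description Block
EPB = 0x00000006   # Enhanced Packet Block


def summarize_pcapng(data):
    """Return (linktypes, counts): link type per interface, packets per interface."""
    endian = "<"
    linktypes = []
    counts = Counter()
    offset = 0
    while offset + 12 <= len(data):
        block_type = struct.unpack_from(endian + "I", data, offset)[0]
        if block_type == SHB:
            # The byte-order magic 0x1A2B3C4D tells us how the section was written.
            magic = data[offset + 8:offset + 12]
            endian = "<" if magic == b"\x4d\x3c\x2b\x1a" else ">"
        total_len = struct.unpack_from(endian + "I", data, offset + 4)[0]
        if total_len < 12 or offset + total_len > len(data):
            break                              # truncated or corrupt block
        body = offset + 8
        if block_type == IDB:
            # IDB body starts with a 16-bit link type.
            linktypes.append(struct.unpack_from(endian + "H", data, body)[0])
        elif block_type == EPB:
            # EPB body starts with a 32-bit interface ID.
            counts[struct.unpack_from(endian + "I", data, body)[0]] += 1
        offset += total_len
    return linktypes, counts
```

Calling `summarize_pcapng(open("memdump.mem.pcapng", "rb").read())` should, under these assumptions, report three interfaces with different link types, which is the situation tcpdump complains about.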
simsong commented 2 years ago

Hm. Let me try another tcpdump...

(base) simsong@Seasons ~ % tcpdump --version
tcpdump version 4.9.3 -- Apple version 114.100.1
libpcap version 1.9.1
LibreSSL 3.3.5
(base) simsong@Seasons ~ % which tcpdump
/usr/sbin/tcpdump
(base) simsong@Seasons ~ % brew install tcpdump
... lots of stuff ...
(base) simsong@Seasons ~ % /opt/homebrew/bin/tcpdump -r memdump.mem.pcapng
reading from file memdump.mem.pcapng, link-type EN10MB (Ethernet), snapshot length 123456
tcpdump: pcap_loop: an interface has a type 101 different from the type of the first interface
(base) simsong@Seasons ~ % ls -l  memdump.mem.pcapng
-rw-r--r--@ 1 simsong  staff  199284 Apr  4 14:59 memdump.mem.pcapng
(base) simsong@Seasons ~ % openssl sha256  memdump.mem.pcapng
SHA256(memdump.mem.pcapng)= 4820b1b5cde71195d57207149f5c1330a7202c2dc363ccc5e530298e582f1337
(base) simsong@Seasons ~ %

Hm... Which version of tcpdump should I be using?

erik4711 commented 2 years ago

> Hm... Which version of tcpdump should I be using?

Dunno, I can't get tcpdump to parse the pcapng file either. Use tshark if you wanna use a command line tool. Otherwise Wireshark is the way to go.

simsong commented 2 years ago

Ugh. Okay. Meanwhile, I've determined that I'm going to have to carve packets that don't have valid checksums.
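
As an illustration of what validating without checksums could look like, here is a small Python sketch of a purely structural plausibility test for a carved IPv6/UDP packet. This is an assumption about one possible approach, not the check bulk_extractor actually adopted, and it will accept more false positives than a checksum test would. The function name `plausible_ipv6_udp` is hypothetical.

```python
import struct

# The carved IPv6/UDP packet posted earlier in this thread.
PACKET = bytes.fromhex(
    "60 00 00 00 00 24 11 01 fe 80 00 00 00 00 00 00 "
    "c4 6b 00 e4 0d 55 62 b1 ff 02 00 00 00 00 00 00 "
    "00 00 00 00 00 01 00 03 f9 9e 14 eb 00 24 21 dc "
    "7e 2a 00 00 00 01 00 00 00 00 00 00 0a 78 7a 69 "
    "67 6e 62 6e 66 76 69 00 00 01 00 01"
)


def plausible_ipv6_udp(buf):
    """Heuristic: do the IPv6 and UDP header fields agree with each other?"""
    if len(buf) < 48:                          # 40-byte IPv6 header + 8-byte UDP header
        return False
    first_word, payload_len, next_header, _hop_limit = struct.unpack_from("!IHBB", buf, 0)
    if first_word >> 28 != 6:                  # version nibble must be 6
        return False
    if next_header != 17:                      # only plain UDP, no extension headers
        return False
    if payload_len < 8 or 40 + payload_len > len(buf):
        return False
    _src_port, _dst_port, udp_len = struct.unpack_from("!HHH", buf, 40)
    return udp_len == payload_len              # the two length fields must match


ok = plausible_ipv6_udp(PACKET)
```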