DataSoft / Honeyd

virtual honeypots
GNU General Public License v2.0

Identifiability by fingerprinting of the ARP/NDP cache behavior #44

Open PherricOxide opened 11 years ago

PherricOxide commented 11 years ago

I noticed several differences in ARP behavior between Honeyd and real Linux machines when observing them in Wireshark. They all come down to the Linux ARP cache being a lot more complicated than a simple table with a maximum size and timeout values.

The simplest example to observe involves two machines that have not yet communicated with each other, one of which is running Honeyd.

Contacting a closed port on Honeyd results in:

x.x.x.x    ->  Honeypot       : TCP SYN
Honeypot   ->  MAC broadcast  : Who has x.x.x.x?
x.x.x.x    ->  Honeypot       : x.x.x.x is at x.x.x.x.x.x
Honeypot   ->  x.x.x.x        : TCP ACK/RST

However, contacting a closed port on the Linux machine results in:

LinuxIP    ->  Honeypot       : TCP SYN
Honeypot   ->  LinuxIP        : TCP ACK/RST
(a few seconds of delay)
Honeypot   ->  LinuxMAC       : Who has x.x.x.x?
LinuxIP    ->  Honeypot       : LinuxIP is at x.x.x.x.x.x
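
A minimal sketch of how the ordering difference between the two traces could be checked automatically. The `Event` tuple, its field names, and the `"target"`/`"client"` labels are hypothetical; it assumes the capture has already been reduced to an ordered list of frames, and it is not part of Honeyd or Wireshark.

```python
from typing import NamedTuple

BROADCAST_MAC = "ff:ff:ff:ff:ff:ff"


class Event(NamedTuple):
    sender: str   # label of the host that sent the frame (hypothetical)
    kind: str     # "tcp_syn", "tcp_rst", "arp_request" or "arp_reply"
    dst_mac: str  # destination MAC address of the Ethernet frame


def classify_arp_ordering(events, target):
    """Return (ordering, arp_destination) for the frames sent by `target`.

    ordering is "arp-first" when the target resolves the peer before replying,
    "reply-first" when the TCP reply goes out before any ARP request, and
    "no-reply" when the target never answered. arp_destination is
    "broadcast", "unicast", or None if no ARP request was seen.
    """
    rst_index = None
    arp_index = None
    arp_destination = None
    for index, event in enumerate(events):
        if event.sender != target:
            continue
        if event.kind == "tcp_rst" and rst_index is None:
            rst_index = index
        elif event.kind == "arp_request" and arp_index is None:
            arp_index = index
            is_broadcast = event.dst_mac.lower() == BROADCAST_MAC
            arp_destination = "broadcast" if is_broadcast else "unicast"

    if rst_index is None:
        ordering = "no-reply"
    elif arp_index is not None and arp_index < rst_index:
        ordering = "arp-first"
    else:
        ordering = "reply-first"
    return ordering, arp_destination
```

Fed the first trace, this reports an ARP request sent to broadcast before the reply; fed the second, it reports the reply first and a unicast ARP request afterwards.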

The behavior seemed odd at first. Linux will let you send an IP packet even if it has not yet done an ARP request for the destination IP address: when it sees packets from a new source IP, it adds the IP/MAC pair to the ARP table in a "DELAY / not confirmed" state. At some point it will send an ARP request to confirm that the MAC address actually belongs to the IP that was seen using it, but this can take several seconds. In the meantime, anything sent to that IP address over an IP socket uses the MAC address seen in the source field of the packets received so far.

A related point is that ARP requests can be sent either to the broadcast MAC address or to a unicast MAC address. The Linux kernel avoids the broadcast MAC address whenever possible to reduce network clutter, and instead sends ARP requests directly to the MAC addresses of entries that are expiring in the table, or to the MAC address seen in the source field of an IP packet.
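
The behavior described above can be sketched as a toy model. This is only an illustration of the observations in this post, not kernel or Honeyd code; the class name, the method names, the `"PROBE"` state label, and the 5 second `probe_delay` (standing in for the "several seconds" seen in the capture) are all assumptions.

```python
class ToyNeighborCache:
    """Toy model: learn MACs from incoming packets, verify them later."""

    def __init__(self, probe_delay=5.0):
        self.probe_delay = probe_delay  # assumed delay before verification
        self.entries = {}               # ip -> {"mac", "state", "since"}

    def saw_packet(self, src_ip, src_mac, now):
        """Record an unconfirmed entry the first time a source IP is seen."""
        if src_ip not in self.entries:
            self.entries[src_ip] = {"mac": src_mac, "state": "DELAY", "since": now}

    def resolve(self, ip):
        """Return the MAC to send to, or None if a broadcast ARP is needed."""
        entry = self.entries.get(ip)
        return entry["mac"] if entry else None

    def due_probes(self, now):
        """Return (ip, dst_mac) pairs for unicast ARP requests that are due."""
        probes = []
        for ip, entry in self.entries.items():
            waited = now - entry["since"]
            if entry["state"] == "DELAY" and waited >= self.probe_delay:
                entry["state"] = "PROBE"
                probes.append((ip, entry["mac"]))  # unicast, not broadcast
        return probes

    def arp_reply(self, ip, mac, now):
        """An ARP reply confirms (or corrects) the entry."""
        self.entries[ip] = {"mac": mac, "state": "REACHABLE", "since": now}
```

In this model a reply to a new peer can be sent immediately through `resolve()`, and the verifying ARP request only appears once `due_probes()` is called after the delay, which matches the ordering in the Linux trace.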

There are a few more anomalies that are easy to spot in Wireshark. The Linux kernel marks ARP entries "STALE" after as little as 60 seconds, which results in a new ARP request when you contact the closed port again. Honeyd simply caches the ARP entry for 10 minutes (hard-coded in a preprocessor macro).
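
As a rough sketch, the timing gap could be turned into a simple heuristic: contact the host, wait somewhere between the two timeouts, contact it again, and note whether a fresh ARP request shows up. The two constants come from the figures in this post; the function name and return labels are made up for illustration, and because Linux entries go stale after "as little as" 60 seconds, the result is a hint rather than proof.

```python
LINUX_STALE_SECONDS = 60          # from the post: "as little as 60 seconds"
HONEYD_CACHE_SECONDS = 10 * 60    # from the post: fixed 10 minute cache


def guess_stack(idle_seconds, saw_arp_request):
    """Guess which ARP cache behavior a host shows after an idle period.

    idle_seconds: time between the first and the second contact.
    saw_arp_request: whether the host sent a new ARP request on the
    second contact.
    """
    # Outside this window both behaviors look the same, so say nothing.
    if idle_seconds < LINUX_STALE_SECONDS or idle_seconds >= HONEYD_CACHE_SECONDS:
        return "inconclusive"
    return "linux-like" if saw_arp_request else "honeyd-like"
```

Choosing an idle time well above 60 seconds but below 10 minutes gives the clearest separation under these assumptions.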

Another interesting feature of the kernel ARP table is that it can adjust the timeout of ARP entries based on positive feedback from higher-level protocols. There is a "base_reachable_time_ms" value you can set, but the actual time an ARP entry stays in the "REACHABLE" state can be almost indefinite if the kernel sees TCP connections being maintained or three-way handshakes completing successfully.
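
A minimal sketch of the positive-feedback idea, assuming a single timer that upper-layer confirmations reset. The class and method names are hypothetical, the base time is given in seconds for simplicity, and the real kernel logic is more involved than this.

```python
class ToyReachableEntry:
    """Toy ARP entry whose REACHABLE lifetime is extended by confirmations."""

    def __init__(self, base_reachable_time, now):
        self.base_reachable_time = base_reachable_time  # seconds, assumed unit
        self.confirmed_at = now

    def confirm(self, now):
        """Positive feedback from a higher layer, e.g. TCP progress."""
        self.confirmed_at = now

    def state(self, now):
        """REACHABLE until the base time passes without any confirmation."""
        age = now - self.confirmed_at
        return "REACHABLE" if age < self.base_reachable_time else "STALE"
```

With no confirmations the entry goes stale once the base time has elapsed; with a confirmation arriving more often than the base time, it never leaves "REACHABLE", which is the "almost indefinite" case described above.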

List of things that might be usable,