node-pcap / node_pcap

libpcap bindings for node
MIT License

Disclaimer: There have been some API changes between v2 and v3; createSession and createOfflineSession now accept an options object. Also, if you're capturing on monitor-mode wifi interfaces, the Radiotap header now has different fields.


node_pcap

Join the chat at https://gitter.im/mranney/node_pcap

This is a set of bindings from libpcap to node as well as some useful libraries to decode, print, and analyze packets. libpcap is a packet capture library used by programs like tcpdump and wireshark. It has been tested on OSX and Linux.

node_pcap is useful for many things, but it does not yet understand all common protocols. Common reasons to use this package are http_trace (works only on node 4), and htracr.

Why capture packets in JavaScript?

There are already many tools for capturing, decoding, and analyzing packets. Many of them are thoroughly tested and very fast. Why would anybody want to do such low-level things as packet capture and analysis in JavaScript? A few reasons:

Installation

You will need libpcap installed. Most OSX machines seem to have it. All major Linux distributions have it available either by default or with a package like libpcap-dev.

The easiest way to get node_pcap and its tools is with npm:

npm install pcap

If you want to hack on the source code, you can get it from github. Clone the repo like this:

git clone https://github.com/node-pcap/node_pcap.git

To compile the native code bindings, do this:

cd node_pcap
node-gyp configure build

Assuming it built without errors, you should be able to run the examples and then write your own packet capture programs.

Usage

There are several example programs that show how to use node_pcap. These examples are the best documentation. Try them out and see what they do.

Starting a capture session

To start a capture session, call pcap.createSession with an interface name and an options object:

var pcap = require('pcap'),
    pcap_session = pcap.createSession(device_name, options);

device_name is the name of the network interface on which to capture packets. If passed an empty string, libpcap will try to pick a "default" interface, which is often just the first one in some list and not what you want.

The options object accepts the following properties:

Note that by default node_pcap opens the interface in promiscuous mode, which generally requires running as root. Unless you are recklessly roaming about as root already, you'll probably want to start your node program like this:

sudo node test.js
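
As a rough sketch of the call above, the helper below opens a session with only the filter option set and attaches a packet listener. The name startCapture and its parameters are made up for illustration; the pcap module is passed in as an argument so the function can be tried against a stand-in object without libpcap or root.

```javascript
// Hypothetical helper: open a live capture limited by a pcap filter.
// `pcapModule` is whatever require('pcap') returned.
function startCapture(pcapModule, deviceName, filter, onPacket) {
    // Only the `filter` option is set here; other options keep their defaults.
    var session = pcapModule.createSession(deviceName, { filter: filter });
    session.on('packet', onPacket);
    return session;
}

// Typical use (needs libpcap and, usually, root):
//   var pcap = require('pcap');
//   startCapture(pcap, 'en0', 'ip proto \\tcp', function (raw_packet) {
//       console.log('got a packet');
//   });
```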

Listening for packets

pcap_session is an EventEmitter that emits a packet event. The only argument to the callback will be a PacketWithHeader object containing the raw bytes returned by libpcap:

pcap_session.on('packet', function (raw_packet) {
    // do some stuff with a raw packet
});

This raw_packet contains buf and header (Buffers) and link_type.

To convert raw_packet into a JavaScript object that is easy to work with, decode it:

var packet = pcap.decode.packet(raw_packet);

The protocol stack is exposed as a nested set of objects. For example, the TCP destination port is part of TCP, which is encapsulated within IP, which is encapsulated within a link layer. Each layer is contained within the payload attribute of the layer that encapsulates it (or of the packet itself):

packet.payload.payload.payload.dport

This structure is easy to explore with util.inspect.

However, if you decide to parse raw_packet.buf yourself, make sure to truncate it to the first caplen bytes first.
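
The truncation might look like the sketch below. It assumes that raw_packet.header holds libpcap's per-packet header as four little-endian 32-bit fields (tv_sec, tv_usec, caplen, len), which puts caplen at byte offset 8. That layout is an assumption here and should be checked against the version you have installed. The helper name is illustrative.

```javascript
// Assumed header layout: tv_sec@0, tv_usec@4, caplen@8, len@12 (uint32 LE).
var CAPLEN_OFFSET = 8;

// Return only the bytes that were actually captured for this packet.
function capturedBytes(raw_packet) {
    var caplen = raw_packet.header.readUInt32LE(CAPLEN_OFFSET);
    // subarray returns a view on the same memory, not a copy;
    // use Buffer.from(...) on the result if you need to keep the bytes.
    return raw_packet.buf.subarray(0, caplen);
}
```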

TCP Analysis

TCP can be analyzed by feeding the packets into a TCPTracker and then listening for session and end events.

var pcap = require('pcap'),
    tcp_tracker = new pcap.TCPTracker(),
    pcap_session = pcap.createSession('en0', { filter: "ip proto \\tcp" });

tcp_tracker.on('session', function (session) {
  console.log("Start of session between " + session.src_name + " and " + session.dst_name);
  session.on('end', function (session) {
      console.log("End of TCP session between " + session.src_name + " and " + session.dst_name);
  });
});

pcap_session.on('packet', function (raw_packet) {
    var packet = pcap.decode.packet(raw_packet);
    tcp_tracker.track_packet(packet);
});

You must only send IPv4 TCP packets to the TCP tracker. Explore the session object with util.inspect to see the wonderful things it can do for you. Hopefully the names of the properties are self-explanatory:

See http_trace for an example of how to use these events to decode HTTP (Works only on node 4).
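
If the capture filter cannot guarantee that only IPv4 TCP traffic reaches the tracker, a guard in the packet handler is one option. The sketch below assumes the decoded link layer exposes an ethertype property and the decoded IPv4 layer exposes a protocol property. Both names are assumptions to verify with util.inspect, and isIPv4Tcp is a hypothetical helper.

```javascript
var ETHERTYPE_IPV4 = 0x0800; // EtherType value for IPv4
var IPPROTO_TCP = 6;         // IP protocol number for TCP

// True only for packets decoded as link layer -> IPv4 -> TCP.
function isIPv4Tcp(packet) {
    var link = packet && packet.payload;
    var ip = link && link.payload;
    return Boolean(ip) &&
        link.ethertype === ETHERTYPE_IPV4 &&
        ip.protocol === IPPROTO_TCP;
}

// Possible use inside the packet handler:
//   var packet = pcap.decode.packet(raw_packet);
//   if (isIPv4Tcp(packet)) {
//       tcp_tracker.track_packet(packet);
//   }
```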

Other operations

To find out the format of the link-layer headers, use pcap_session.link_type or raw_packet.link_type. The property is a LINKTYPE_<...> string; see this list.

To get current capture statistics, use pcap_session.stats(). This returns an object with the following properties:

For more info, see pcap_stats.

If you no longer need to receive packets, you can use pcap_session.close().

To read packets from a file instead of from a live interface, use createOfflineSession instead:

pcap.createOfflineSession('/path/to/capture.pcap', options);

Here, options only accepts the filter property.
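
A small sketch of reading a capture file and counting its packets follows. It assumes the offline session emits a complete event once the file has been read through; treat that event name as an assumption and confirm it for your version. The countPackets helper is hypothetical, and takes the pcap module as an argument so it can be tried with a stand-in.

```javascript
// Hypothetical helper: count the packets in a capture file.
function countPackets(pcapModule, path, filter, done) {
    var session = pcapModule.createOfflineSession(path, { filter: filter });
    var count = 0;
    session.on('packet', function () {
        count += 1;
    });
    // Assumption: 'complete' fires when the whole file has been processed.
    session.on('complete', function () {
        done(count);
    });
    return session;
}

// Possible use:
//   countPackets(require('pcap'), '/path/to/capture.pcap', 'ip proto \\tcp',
//       function (n) { console.log(n + ' packets'); });
```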

Some Common Problems

TCP Segmentation Offload - TSO

TSO is a technique that modern operating systems use to offload the burden of IP/TCP header computation to the network hardware. It also reduces the number of times that data is moved between the kernel and the network hardware. TSO saves CPU when sending data that is larger than a single IP packet.

This is amazing and wonderful, but it does make some kinds of packet sniffing more difficult. In many cases, it is important to see the exact packets that are sent, but if the network hardware is sending the packets, these are not available to libpcap. The solution is to disable TSO.

OSX:

sudo sysctl -w net.inet.tcp.tso=0

Linux (substitute correct interface name):

sudo ethtool -K eth0 tso off

The symptoms of needing to disable TSO are messages like, "Received ACK for packet we didn't see get sent".

IPv6

Sadly, node_pcap does not know how to decode IPv6 packets yet. When capturing traffic to localhost, IPv6 traffic will often show up even though you were expecting IPv4. A common case is the hostname localhost, which many client programs will resolve to the IPv6 address ::1 first and only then try 127.0.0.1. Until we get IPv6 decode support, a libpcap filter can be set to only see IPv4 traffic:

sudo http_trace lo0 "ip proto \tcp"

The backslash is important. The pcap filter language has an ambiguity with the word "tcp", so by escaping it, you'll get the correct interpretation for this case.
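
When the same filter is written inside a JavaScript string literal, the backslash itself has to be escaped, otherwise "\t" is read as a tab character. A short illustration (the constant names are only for this example):

```javascript
// Escaped form: "\\" in the source becomes one backslash in the string.
var FILTER_ESCAPED = "ip proto \\tcp";

// String.raw leaves backslashes untouched, so no doubling is needed.
var FILTER_RAW = String.raw`ip proto \tcp`;

// Unescaped form: "\t" turns into a tab, which is not the intended filter.
var FILTER_WRONG = "ip proto \tcp";
```

The first two constants hold the identical string and are safe to pass as the filter option; the third silently loses the backslash.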

Dropped packets

There are several levels of buffering involved in capturing packets. Sometimes these buffers fill up, and you'll drop packets. If this happens, it becomes difficult to reconstruct higher level protocols. The best way to keep the buffers from filling up is to use pcap filters to only consider traffic that you need to decode. The pcap filters are very efficient and run close to the kernel where they can process high packet rates.

If the pcap filters are set correctly and libpcap still drops packets, you can increase the bufferSize option. To check if there's any packet loss, you can use pcap_session.stats() as indicated above.
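
One way to check for loss is to summarize the stats object from time to time. The sketch below assumes the object uses libpcap's pcap_stat field names (ps_recv, ps_drop, ps_ifdrop); that mapping is an assumption, and dropSummary is a hypothetical helper.

```javascript
// Summarize a stats object; missing counters are treated as zero.
function dropSummary(stats) {
    var dropped = (stats.ps_drop || 0) + (stats.ps_ifdrop || 0);
    return {
        received: stats.ps_recv || 0,
        dropped: dropped,
        lossy: dropped > 0
    };
}

// Possible use, polling every five seconds:
//   setInterval(function () {
//       var summary = dropSummary(pcap_session.stats());
//       if (summary.lossy) {
//           console.warn('dropped ' + summary.dropped + ' packets so far');
//       }
//   }, 5000);
```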

Handling warnings

libpcap may sometimes emit warnings (for instance, when an interface has no address). By default these are printed to the console, but you can override the warning handler with your own function:

pcap.warningHandler = function (text) {
    // ...
}
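
For instance, a handler that collects warnings instead of printing them could be sketched like this. The collectWarnings helper is hypothetical; it takes the pcap module as an argument and relies only on the warningHandler property described above.

```javascript
// Replace the default handler with one that stores warnings in an array.
function collectWarnings(pcapModule) {
    var warnings = [];
    pcapModule.warningHandler = function (text) {
        warnings.push(String(text));
    };
    return warnings;
}

// Possible use:
//   var warnings = collectWarnings(require('pcap'));
//   // ... later, inspect or log `warnings` as needed
```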

Examples

redis_trace

http_trace (Node 4 only)