libprotoident 2.0.15
Copyright (c) 2011-2020 The University of Waikato, Hamilton, New Zealand. All rights reserved.
See the file COPYING and COPYING.LESSER for full licensing details for this software.
Report and bugs, questions or comments to contact@wand.net.nz
NEW: You can now lodge bugs by filing an issue on the libprotoident github: https://github.com/wanduow/libprotoident
Authors: Shane Alcock
With contributions from: Donald Neal Aaron Murrihy Paweł Foremski pjf@iitis.pl Fabian Weisshaar elnappo@nerdpol.io Jeroen Roovers Jiri Havranek Romain Fontugne Jacob van Walraven
Libprotoident is a library designed to perform application protocol identification using a very limited form of deep packet inspection, i.e. using the first four bytes of application payload sent in each direction. The library provides a simple API that will enable programmers to develop their own tools that utilise application protocol information and we have also included some tools that can be used to perform simple analysis of traffic flows.
libtrace 4.0.1 or later
libflowmanager 3.0.0 or later
After having installed the required libraries, running the following series of commands should install libprotoident
./bootstrap.sh (only if you've cloned the source from GitHub)
./configure
make
make install
By default, libprotoident installs to /usr/local - this can be changed by
appending the --prefix=
The libprotoident tools are built by default - this can be changed by using the --with-tools=no option with ./configure.
A full list of supported protocols can be found at https://github.com/wanduow/libprotoident/wiki/SupportedProtocols
Libprotoident also currently has rules for several "mystery" protocols. These are patterns that commonly occur in our trace sets that we cannot tie to an actual protocol. It would be nice to know what these protocols actually are - if you have any suggestions please feel free to email us at contact@wand.net.nz.
In addition, a flow can be assigned into a "category" based on the protocol determined by libprotoident, enabling broader analysis. For example, BitTorrent, Gnutella and eMule all fall into the P2P category, whereas SMTP, POP3 and IMAP are part of the Mail category.
There are three tools included with libprotoident.
lpi_protoident
Description:
This tool attempts to identify each individual flow within the provided trace. Identification only occurs when the flow has concluded or expired, so it is not very effective for real-time applications.
Usage: lpi_protoident
The input trace must be a valid libtrace URI. See https://github.com/LibtraceTeam/libtrace/wiki/Supported-Trace-Formats to learn more about libtrace URIs. Note that a URI may be a live source, such as a network interface.
Output: For each flow in the input trace, a single line is printed to stdout describing the flow. The line contains the following fields separated by spaces (in order):
lpi_find_unknown
Description:
This tool reports all the flows in a trace which libprotoident was unable to identify. Identification only occurs when the flow has concluded or expired, so it is not very effective for real-time applications.
This is mainly intended as a tool to aid development of new protocol identifiers.
Usage: lpi_find_unknown
The input trace must be a valid libtrace URI. See https://github.com/LibtraceTeam/libtrace/wiki/Supported-Trace-Formats to learn more about libtrace URIs. Note that a URI may be a live source, such as a network interface.
Output: For each unknown flow in the input trace, a single line is printed to stdout describing the flow. The line contains the following fields separated by spaces (in order):
lpi_arff
Description: This tool is similar to lpi_protoident except that it writes its output in the ARFF format so that it is compatible with the Weka machine learning software (http://www.cs.waikato.ac.nz/ml/weka/).
This tool was contributed by Paweł Foremski <pjf@iitis.pl>.
Usage: lpi_arff
The input trace must be a valid libtrace URI. See https://github.com/LibtraceTeam/libtrace/wiki/Supported-Trace-Formats to learn more about libtrace URIs. Note that a URI may be a live source, such as a network interface.
Output: The output begins with a series of lines describing each feature that will be used to describe each flow. Following that, for each flow in the input trace, a single line is printed to stdout describing the flow. The line contains the following fields separated by commas (in order):
If you want to develop your own tools based on libprotoident, you'll need to use the libprotoident API. The API is very simple and the best way to learn it is to examine how the existing tools work. The source for the tools is located in the tools/ directory.
The tools use libflowmanager to do the flow tracking, using an instance of a FlowManager class. You will probably want to incorporate this into your own tool. Usage of libprotoident itself is through functions beginning with 'lpi_'.
The libprotoident API functions themselves are documented in lib/libprotoident.h if you need further guidance.
Further documentation of the API can also be found at https://github.com/LibtraceTeam/libflowmanager
If all else fails, drop me a line at shane@alcock.co.nz