Chapter 11. Application Layer

anitsh commented 4 years ago

Learning Objectives

By the end of this chapter, you should be able to:

        Discuss open source analytics and monitoring tools: PNDA and SNAS.

SNAS (Streaming Network Analytics System)

The Streaming Network Analytics System (SNAS) is an open source network analytics project hosted by The Linux Foundation. According to the SNAS website,

SNAS "is a framework to collect, track and access tens of millions of routing objects (routers, peers, prefixes) in real time".

SNAS grew out of OpenBMP, an open source project for monitoring BGP state. SNAS aims for a broader scope than BGP monitoring alone, but it is still in its early stages and currently supports only BGP analytics through its OpenBMP base. The BGP Monitoring Protocol (BMP), defined in RFC 7854, is a protocol for monitoring BGP sessions and objects.
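To make the BMP reference concrete, here is a minimal sketch of decoding the common header that RFC 7854 places at the start of every BMP message (a 1-byte version, a 4-byte message length, and a 1-byte message type). The function name and the hand-built sample bytes are illustrative assumptions; this is not code taken from OpenBMP or SNAS.

```python
import struct

# Message type codes as listed in RFC 7854, section 4.1.
BMP_MESSAGE_TYPES = {
    0: "Route Monitoring",
    1: "Statistics Report",
    2: "Peer Down Notification",
    3: "Peer Up Notification",
    4: "Initiation Message",
    5: "Termination Message",
    6: "Route Mirroring Message",
}

# Network byte order: version (1 byte), message length (4 bytes), type (1 byte).
BMP_COMMON_HEADER = struct.Struct("!BIB")


def parse_bmp_common_header(data: bytes) -> dict:
    """Decode the 6-byte common header that starts every BMP message."""
    if len(data) < BMP_COMMON_HEADER.size:
        raise ValueError("need at least 6 bytes for a BMP common header")
    version, length, msg_type = BMP_COMMON_HEADER.unpack_from(data)
    return {
        "version": version,
        "length": length,  # total message length, common header included
        "type": BMP_MESSAGE_TYPES.get(msg_type, f"Unknown ({msg_type})"),
    }


# Hand-built sample (an assumption for illustration): version 3, length 6,
# type 4 (Initiation Message) with no information TLVs following.
sample = bytes([3, 0, 0, 0, 6, 4])
header = parse_bmp_common_header(sample)
print(header)
```

A real collector would go on to read the remaining `length - 6` bytes and decode them according to the message type.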

SNAS is built on Apache Kafka and the OpenBMP server, which receives the BMP messages. It uses MySQL to store state and BGP information.

SNAS collects BMP information from routers, stores it, and provides live analytics and visualization. It also has Java and Python APIs that you can use to read BMP data from SNAS in a fully parsed format, such as JSON.
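As a rough illustration of what working with parsed data might look like, the sketch below builds a per-peer prefix table from JSON records using only the Python standard library. The record shape (`peer`, `prefix`, `action`) is hypothetical and chosen for the example; the actual SNAS message fields and API calls are not shown here and may differ.

```python
import json
from collections import defaultdict

# Hypothetical parsed records, one JSON object per line.
# Real SNAS field names and values may differ.
raw_records = [
    '{"peer": "10.0.0.1", "prefix": "192.0.2.0/24", "action": "add"}',
    '{"peer": "10.0.0.1", "prefix": "198.51.100.0/24", "action": "add"}',
    '{"peer": "10.0.0.2", "prefix": "203.0.113.0/24", "action": "add"}',
    '{"peer": "10.0.0.1", "prefix": "192.0.2.0/24", "action": "del"}',
]


def prefixes_by_peer(lines):
    """Replay add/del records and return the prefixes currently held per peer."""
    table = defaultdict(set)
    for line in lines:
        record = json.loads(line)
        if record["action"] == "add":
            table[record["peer"]].add(record["prefix"])
        else:
            table[record["peer"]].discard(record["prefix"])
    return {peer: sorted(prefixes) for peer, prefixes in table.items()}


peer_table = prefixes_by_peer(raw_records)
print(peer_table)
```

Because the records arrive already parsed, the consumer only has to track state; it never touches raw BMP or BGP wire formats.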

SNAS Architecture

SNAS has a web GUI, which is currently packaged as a Docker container. The browser UI includes several analytics modules, such as Prefix Analytics and History, Peer Analysis, and ASN Visualization.

You can also use Grafana to interact with stored SNAS data, as well as the analytical and time series data in the SNAS MySQL database.

PNDA

PNDA, or Platform for Network Data Analytics, is a big data analytics platform designed for network data. PNDA is based on multiple open source technologies, including Apache Kafka, Hadoop, Apache HBase, Grafana, Logstash, and Oozie.

PNDA hides the complexity of big data systems. It gives networking professionals a single platform for loading network data and building analytics applications, without having to integrate the underlying big data systems themselves.

PNDA features:

        Open source platform for Network Data Analytics
        Aggregates data like logs, metrics and network telemetry
        Scales up to consume millions of messages per second
        Efficiently distributes data with a publisher and a subscriber model
        Processes bulk data in batches, or streams data in real-time
        Manages the lifecycle of applications that process and analyze data
        Lets you explore data using interactive notebooks.

PNDA has a three-tier architecture:

        Log ingestion plugins, which get data into PNDA
        Analysis engine, which includes data distribution, parsers, storage, big data queries, and data visualization
        Consumer applications, which are PNDA applications that use PNDA's analytical information to serve specific use cases for the user.

PNDA decouples data sources from applications and data consumers. This lets you ingest a stream of data once and have multiple applications consume the same data for different use cases.
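The decoupling idea can be sketched with a tiny in-memory publish/subscribe bus: the source publishes each message once, and every subscribed application receives it independently. This is a toy stand-in for the role Kafka plays in PNDA, not PNDA code; the class, topic, and field names are assumptions made for the example.

```python
from collections import Counter, defaultdict


class TopicBus:
    """Minimal in-memory publish/subscribe bus (illustrative stand-in for Kafka)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, message):
        # The publisher knows nothing about who consumes the message.
        for handler in self._subscribers[topic]:
            handler(message)


bus = TopicBus()

# Consumer 1: count messages per router.
messages_per_router = Counter()
bus.subscribe("telemetry", lambda m: messages_per_router.update([m["router"]]))

# Consumer 2: collect routers that reported a peer going down.
peer_down_routers = []
bus.subscribe(
    "telemetry",
    lambda m: peer_down_routers.append(m["router"]) if m["event"] == "peer_down" else None,
)

# The data source publishes each message exactly once.
for message in [
    {"router": "r1", "event": "update"},
    {"router": "r2", "event": "peer_down"},
    {"router": "r1", "event": "update"},
]:
    bus.publish("telemetry", message)

print(dict(messages_per_router), peer_down_routers)
```

Adding a third consumer requires only another `subscribe` call; the publishing side does not change.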

PNDA Principles and Benefits

In today’s world of big data and analytics, multiple systems and solutions are involved between the moment a log file is uploaded to the big data platform and the moment the final telemetry application visualizes the output. These systems mostly work in silos, and you need to deploy several of them to get the final outcome. Networking professionals may also be unfamiliar with big data platforms. PNDA therefore aims to provide a simple analytics pipeline platform for networking that makes the desired outcomes easier to reach.

Siloed Analytics Pipelines

PNDA removes the complexity of big data platforms, and provides a simple platform that can be used to develop your analytics application.

BGP Analytics Application (Example)

In this sample application, BGP data can be analyzed and used for network analytics.

BGP Analytics Pipeline

This application starts with BGP routers sending BGP logs to the OpenBMP collector via BMP. The logs are then sent to a Logstash instance to be ingested into Apache Kafka within PNDA for distribution. Once Kafka receives the raw data, Gobblin takes the data from the Kafka data distribution layer and ingests it into HDFS (Hadoop Distributed File System). HDFS is PNDA's file datastore, which stores raw data.

PNDA big data applications are written in Apache Spark to analyze the raw data. Your application parses the raw data logs and extracts information from them. Once the Spark application completes its processing and generates the analytics information, it pushes its output into an OpenTSDB database. OpenTSDB is a metric data store.

Finally, to visualize the analytics data, the UI (user interface) queries the data stored in OpenTSDB. “Impala” presents an SQL interface from OpenTSDB to the UI to allow data retrieval.

From a developer's perspective, you only need to write the Apache Spark application that parses the raw data, extracts the information, performs the analytical calculation, and stores the results in OpenTSDB.
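The developer's part of the pipeline (parse, extract, calculate, store) can be outlined in plain Python. This sketch deliberately uses the standard library instead of Spark so that it runs anywhere; the raw line layout and the `bgp.updates` metric name are assumptions for illustration. The output dictionaries follow the metric/timestamp/value/tags shape that OpenTSDB accepts for data points, but nothing is sent to a database here.

```python
from collections import Counter

# Hypothetical raw log lines: "<epoch seconds> <peer> <action> <prefix>".
raw_lines = [
    "1700000040 10.0.0.1 announce 192.0.2.0/24",
    "1700000050 10.0.0.1 withdraw 192.0.2.0/24",
    "1700000070 10.0.0.2 announce 198.51.100.0/24",
    "1700000105 10.0.0.1 announce 203.0.113.0/24",
]


def parse(line):
    """Split one raw line into typed fields."""
    timestamp, peer, action, prefix = line.split()
    return int(timestamp), peer, action, prefix


def updates_per_minute(lines):
    """Count BGP updates per peer per minute and emit metric data points."""
    counts = Counter()
    for line in lines:
        timestamp, peer, _action, _prefix = parse(line)
        minute = timestamp - timestamp % 60  # truncate to the start of the minute
        counts[(minute, peer)] += 1
    return [
        {
            "metric": "bgp.updates",  # hypothetical metric name
            "timestamp": minute,
            "value": count,
            "tags": {"peer": peer},
        }
        for (minute, peer), count in sorted(counts.items())
    ]


points = updates_per_minute(raw_lines)
for point in points:
    print(point)
```

In a Spark version, the parse step would typically become a map over the raw records and the counting step a keyed aggregation, with the resulting points written to OpenTSDB at the end of the job.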

Learning Objectives (Review)

You should now be able to:

        Discuss open source analytics and monitoring tools: PNDA and SNAS.

Summary

In this chapter, we learned about two open source projects that can be used to analyze network data and extract information from it:

        SNAS
        A platform for real-time analysis of networking objects, with the main use case of BGP analysis.
        PNDA
        A big data platform used to analyze raw network data and extract analytical information.