Open anitsh opened 4 years ago
SNAS (Streaming Network Analytics System)
The Streaming Network Analytics System (SNAS) is an open source network analytics project hosted by The Linux Foundation. According to the SNAS website,
SNAS "is a framework to collect, track and access tens of millions of routing objects (routers, peers, prefixes) in real time".
SNAS grew out of OpenBMP, an open source project for monitoring BGP state through the BGP Monitoring Protocol (BMP). SNAS aims for a broader scope than BGP monitoring alone. However, it is still in its early stages and currently supports only BGP analytics via its OpenBMP base. BMP is a protocol for monitoring BGP sessions and objects, and it is defined in RFC 7854.
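To make the BMP reference concrete, here is a minimal sketch of decoding the common header that RFC 7854 places at the start of every BMP message (a 1-byte version, a 4-byte message length, and a 1-byte message type). The function name and the sample bytes are illustrative assumptions, not part of OpenBMP or SNAS.

```python
import struct

# Message type codes from the RFC 7854 common header definition
BMP_MESSAGE_TYPES = {
    0: "Route Monitoring",
    1: "Statistics Report",
    2: "Peer Down Notification",
    3: "Peer Up Notification",
    4: "Initiation Message",
    5: "Termination Message",
    6: "Route Mirroring Message",
}

# Network byte order: version (1 byte), message length (4 bytes), type (1 byte)
BMP_COMMON_HEADER = struct.Struct("!BIB")


def parse_bmp_common_header(data: bytes) -> dict:
    """Decode the 6-byte BMP common header at the start of `data`."""
    if len(data) < BMP_COMMON_HEADER.size:
        raise ValueError("need at least 6 bytes for a BMP common header")
    version, length, msg_type = BMP_COMMON_HEADER.unpack_from(data)
    return {
        "version": version,
        "length": length,  # total message length, common header included
        "type": BMP_MESSAGE_TYPES.get(msg_type, f"Unknown ({msg_type})"),
    }


# Hypothetical sample: a version 3 Initiation message carrying no TLVs
sample = bytes([3, 0, 0, 0, 6, 4])
header = parse_bmp_common_header(sample)
print(header)
```

A collector such as OpenBMP does far more than this (per-peer headers, BGP PDU parsing), so treat the sketch only as an illustration of the framing.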
SNAS is built on Apache Kafka and the OpenBMP server, which receives the BMP messages. It uses MySQL to store state and BGP information.
SNAS collects BMP information from routers, stores it, and provides live analytics and visualization. It also has Java and Python APIs that you can use to read BMP data from SNAS in a fully parsed format, such as JSON.
SNAS Architecture
SNAS has a web GUI which is currently packaged as a Docker container. The Browser UI has many analytics modules, such as Prefix Analytics and History, Peer Analysis, and ASN Visualization.
You can also use Grafana to interact with stored SNAS data, as well as the analytical and time series data in the SNAS MySQL database.
PNDA
PNDA, or Platform for Network Data Analytics, is a big data analytics platform designed for network data. PNDA is based on multiple open source technologies, such as Apache Kafka, Hadoop, Apache HBase, Grafana, Logstash, and Oozie.
PNDA hides the complexity of big data systems and gives networking professionals a simple platform on which to load network data, build analytics applications, and integrate with the underlying big data systems.
PNDA features:
Open source platform for Network Data Analytics
Aggregates data like logs, metrics and network telemetry
Scales up to consume millions of messages per second
Efficiently distributes data with a publisher and a subscriber model
Processes bulk data in batches, or streams data in real-time
Manages the lifecycle of applications that process and analyze data
Lets you explore data using interactive notebooks.
PNDA has a 3-tier architecture:
Log ingestion plugin to get data into PNDA
Analysis engine, which includes data distribution, parsers, storage, big data queries, and data visualization
Consumer application - PNDA applications that utilize the PNDA analytical information to produce specific use cases for the user.
PNDA decouples the data sources from applications and data consumers. This lets you ingest a stream of data once and have multiple applications reuse and consume the same data for different use cases.
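The "ingest once, consume many times" idea can be sketched with a small in-memory stand-in for a Kafka-style topic. This is only an illustration of the publish/subscribe pattern under simplifying assumptions: the TopicLog class, the consumer names, and the sample messages are hypothetical and are not PNDA or Kafka APIs.

```python
from collections import defaultdict


class TopicLog:
    """In-memory stand-in for a topic: an append-only list of messages."""

    def __init__(self):
        self._messages = []
        self._offsets = defaultdict(int)  # consumer name -> next index to read

    def publish(self, message):
        # The data source writes each message exactly once.
        self._messages.append(message)

    def poll(self, consumer):
        """Return messages this consumer has not seen yet and advance its offset."""
        start = self._offsets[consumer]
        self._offsets[consumer] = len(self._messages)
        return self._messages[start:]


topic = TopicLog()
for line in [
    "peer 10.0.0.1 up",
    "prefix 192.0.2.0/24 announced",
    "peer 10.0.0.1 down",
]:
    topic.publish(line)

# Two independent consumers read the same data for different purposes.
peer_events = [m for m in topic.poll("peer-monitor") if m.startswith("peer")]
all_events = topic.poll("archiver")
print(peer_events)
print(len(all_events))
```

Because each consumer tracks its own offset, adding a new application does not require the data source to send anything again.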
PNDA Principles and Benefits
In today’s world of big data and analytics, multiple systems and solutions are involved between the moment a log file is uploaded to the big data platform and the final telemetry application that visualizes the output. These systems mostly work in silos, and you need to deploy several of them to get the final outcome. Networking professionals may also be unfamiliar with big data platforms. PNDA therefore aims to provide a simple analytics pipeline platform for networking that makes the desired outcomes easier to reach.
Siloed Analytics Pipelines
PNDA removes the complexity of big data platforms, and provides a simple platform that can be used to develop your analytics application.
BGP Analytics Application (Example)
In this sample application, BGP data can be analyzed and used for network analytics.
BGP Analytics Pipeline
This application starts with BGP routers sending BGP logs to the OpenBMP collector via BMP. The logs are then sent to a Logstash instance to be ingested into Apache Kafka within PNDA for distribution. Once Kafka receives the raw data, Gobblin takes the data from the Kafka data distribution and ingests it into HDFS (Hadoop Distributed File System). HDFS is PNDA's file datastore, which stores raw data.
PNDA big data applications are written in Apache Spark for analysis of raw data. You write your application to parse the raw data and extract information from the raw data logs. Once the Spark application completes its processing and generates the analytics information, it pushes its output data into an OpenTSDB database. OpenTSDB is a metric data store.
Finally, to visualize the analytics data, the UI (user interface) queries the data stored in OpenTSDB. Impala presents an SQL interface from OpenTSDB to the UI to allow data retrieval.
From a developer perspective, you only need to write the Apache Spark application that parses the raw data, extracts the information, performs the analytical calculation, and stores the results in OpenTSDB.
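The parse, extract, and calculate steps could look roughly like the sketch below. It uses plain Python rather than Apache Spark so that it runs anywhere, and it only builds data points with the metric, timestamp, value, and tags fields that OpenTSDB data points carry; it does not write to a database. The raw record layout, the metric name bgp.announcements, and the function names are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical raw records: "epoch_seconds,peer_ip,action,prefix"
RAW_LINES = [
    "1700000005,10.0.0.1,announce,192.0.2.0/24",
    "1700000030,10.0.0.1,announce,198.51.100.0/24",
    "1700000042,10.0.0.2,withdraw,203.0.113.0/24",
    "1700000075,10.0.0.1,announce,203.0.113.0/24",
]


def parse(line):
    """Split one raw record into typed fields."""
    ts, peer, action, prefix = line.split(",")
    return int(ts), peer, action, prefix


def announcements_per_minute(lines):
    """Count announcements per peer per one-minute bucket."""
    counts = Counter()
    for ts, peer, action, _prefix in map(parse, lines):
        if action == "announce":
            bucket = ts - ts % 60  # round down to the start of the minute
            counts[(bucket, peer)] += 1
    # Shape each result as a metric data point (metric name is an assumption).
    return [
        {
            "metric": "bgp.announcements",
            "timestamp": bucket,
            "value": n,
            "tags": {"peer": peer},
        }
        for (bucket, peer), n in sorted(counts.items())
    ]


points = announcements_per_minute(RAW_LINES)
for point in points:
    print(point)
```

In a real PNDA application the same logic would be expressed as Spark transformations over the data in HDFS, with the resulting points written to OpenTSDB.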
Learning Objectives (Review)
You should now be able to:
Discuss open source analytics and monitoring tools: PNDA and SNAS.
Summary
In this chapter, we learned about two open source projects that can be used for analytics and extraction of information out of the network data:
SNAS
A platform for real-time analysis of networking objects, with the main use case of BGP analysis.
PNDA
A big data platform used to analyze raw network data and extract analytical information.
Learning Objectives
By the end of this chapter, you should be able to: