GNU General Public License v3.0

aci-monitoring-stack - Open Source Monitoring for Cisco ACI

Overview

Harness the power of open source to efficiently monitor your Cisco ACI environment with the ACI-Monitoring-Stack. This lightweight, yet robust, monitoring solution combines top-tier open source tools, each contributing unique capabilities to ensure comprehensive visibility into your ACI infrastructure.

The ACI-Monitoring-Stack integrates the following key components:

Your Stack

To understand the ACI Monitoring Stack and its components, it helps to break the stack down into separate functions. Each function covers a different aspect of monitoring the Cisco Application Centric Infrastructure (ACI) environment.

Fabric Discovery:

The ACI monitoring stack uses Prometheus Service Discovery (HTTP SD) to dynamically discover and scrape targets by periodically querying a specified HTTP endpoint for a list of target configurations in JSON format.

The ACI Monitoring Stack needs only the IP addresses of the APICs; the switches are discovered automatically. If switches are added to or removed from the fabric, no action is required from the end user.

    flowchart-elk RL
      P[("Prometheus")]
      A["aci-exporter"]
      APIC["APIC"]

      APIC -- "API Query" --> A
      A -- "HTTP SD" --> P
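
As a rough illustration of the JSON document that Prometheus HTTP SD expects (a list of target groups, each with `targets` and optional `labels`), the sketch below builds such a payload in Python. The inventory, the addresses, and the `fabric` and `role` label names are assumptions made for this example; they are not taken from the aci-exporter.

```python
import json


def build_http_sd_payload(nodes, fabric):
    """Return an HTTP SD document: a JSON list of target groups."""
    groups = []
    for node in nodes:
        groups.append(
            {
                # Prometheus scrapes every address listed under "targets".
                "targets": [node["address"]],
                # Label values must be strings; these names are hypothetical.
                "labels": {"fabric": fabric, "role": node["role"]},
            }
        )
    return json.dumps(groups, indent=2)


# Hypothetical inventory, standing in for what an APIC query might return.
nodes = [
    {"address": "10.0.0.1", "role": "apic"},
    {"address": "10.0.0.101", "role": "leaf"},
]
payload = build_http_sd_payload(nodes, fabric="fab1")
print(payload)
```

Because Prometheus re-reads the endpoint periodically, a switch that appears in or disappears from the list is picked up without any change to the Prometheus configuration.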

ACI Object Scraping:

Prometheus scraping is the process by which Prometheus periodically collects metrics data by sending HTTP requests to predefined endpoints on monitored targets. The aci-exporter translates ACI-specific metrics into a format that Prometheus can ingest, ensuring that all crucial data points are captured and monitored effectively.

    flowchart-elk RL
      P[("Prometheus")]
      A["aci-exporter"]
      subgraph ACI
        S["Switches"]
        APIC["APIC"]
      end
      A--"Scraping"-->P
      S--"API Queries"-->A
      APIC--"API Queries"-->A
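
To make the translation step more concrete, here is a minimal sketch of how a value read from an ACI object could be rendered as one line of the Prometheus text exposition format. The metric name `aci_node_health` and its labels are hypothetical and chosen only for illustration; the actual names are defined by the aci-exporter configuration.

```python
def escape_label_value(value):
    """Escape backslash, newline and double quote, as the text format requires."""
    return value.replace("\\", "\\\\").replace("\n", "\\n").replace('"', '\\"')


def to_exposition_line(name, labels, value):
    """Render one sample as: name{label="value",...} value"""
    label_text = ",".join(
        f'{key}="{escape_label_value(str(val))}"'
        for key, val in sorted(labels.items())
    )
    return f"{name}{{{label_text}}} {value}"


# Hypothetical sample: a health score of 95 for node 101 in fabric "fab1".
line = to_exposition_line("aci_node_health", {"node": "101", "fabric": "fab1"}, 95)
print(line)
```

Prometheus receives lines like this in the body of the HTTP response to each scrape request.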

Syslog Ingestion:

The syslog pipeline has three components: Promtail, Loki and syslog-ng. For ACI versions prior to 6.1, syslog-ng is required between ACI and Promtail to convert messages from the RFC 3164 syslog format to the RFC 5424 format.

    flowchart-elk LR
      L["Loki"]
      PT["Promtail"]
      SL["Syslog-ng"]
      PT-->L
      SL-->PT
      subgraph ACI
        S["Switches"]
        APIC["APIC"]
      end
      V{Ver >= 6.1}
      S--"Syslog"-->V
      APIC--"Syslog"-->V
      V -->|Yes| PT
      V -->|No| SL
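
The decision in the diagram and the kind of conversion syslog-ng performs can be sketched as follows. This is a simplified illustration, not the syslog-ng implementation: the version-string shape (for example `6.0(2h)`), the sample message, the explicit `year` argument (RFC 3164 timestamps carry no year) and the UTC assumption are all assumptions made for the example.

```python
import re
from datetime import datetime

# Simplified RFC 3164 shape: <PRI>Mmm dd hh:mm:ss HOST TAG[PID]: MSG
RFC3164 = re.compile(
    r"^<(?P<pri>\d{1,3})>"
    r"(?P<ts>[A-Z][a-z]{2} [ \d]\d \d{2}:\d{2}:\d{2}) "
    r"(?P<host>\S+) "
    r"(?P<tag>[^:\[\s]+)(?:\[(?P<pid>\d+)\])?: ?"
    r"(?P<msg>.*)$"
)


def needs_syslog_ng(aci_version):
    """True when the ACI release is older than 6.1."""
    match = re.match(r"(\d+)\.(\d+)", aci_version)
    if match is None:
        raise ValueError(f"unrecognised version: {aci_version!r}")
    return (int(match.group(1)), int(match.group(2))) < (6, 1)


def rfc3164_to_rfc5424(line, year):
    """Rewrite a simple RFC 3164 line as RFC 5424, assuming UTC timestamps."""
    match = RFC3164.match(line)
    if match is None:
        raise ValueError("not a recognisable RFC 3164 message")
    # Collapse the padding space used for single-digit days before parsing.
    stamp = " ".join(match["ts"].split())
    ts = datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")
    procid = match["pid"] or "-"
    # RFC 5424 layout: <PRI>VERSION TIMESTAMP HOST APP PROCID MSGID SD MSG
    return (
        f"<{match['pri']}>1 {ts.isoformat()}Z {match['host']} "
        f"{match['tag']} {procid} - - {match['msg']}"
    )


# Hypothetical message, not captured from a real fabric.
converted = rfc3164_to_rfc5424("<134>Oct  5 10:00:00 leaf101 snmpd[1234]: link down", 2024)
print(needs_syslog_ng("6.0(2h)"), converted)
```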

Data Visualization

Data visualization is handled by Grafana, an open-source analytics and monitoring platform that lets users visualize, query, and analyze data from various sources through customizable, interactive dashboards. It supports a wide range of data sources, including Prometheus and Loki, so users can build real-time visualizations, alerts, and reports to monitor system performance.

    flowchart-elk RL
      G["Grafana"]
      L["Loki"]
      P[("Prometheus")]
      U["User"]

      P--"PromQL"-->G
      L--"LogQL"-->G
      G-->U
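
For readers who want to see what the PromQL and LogQL arrows amount to on the wire, the sketch below builds request URLs for the Prometheus instant-query endpoint (`/api/v1/query`) and the Loki range-query endpoint (`/loki/api/v1/query_range`). The host names, ports, metric name and log selector are placeholders assumed for this example, not values from this stack.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def prometheus_query_url(base_url, promql):
    """URL for a Prometheus instant query."""
    params = urlencode({"query": promql})
    return base_url.rstrip("/") + "/api/v1/query?" + params


def loki_query_range_url(base_url, logql, limit=100):
    """URL for a Loki range query returning at most `limit` entries."""
    params = urlencode({"query": logql, "limit": limit})
    return base_url.rstrip("/") + "/loki/api/v1/query_range?" + params


# Placeholder endpoints and queries, for illustration only.
prom_url = prometheus_query_url("http://prometheus:9090", 'aci_node_health{fabric="fab1"}')
loki_url = loki_query_range_url("http://loki:3100/", '{job="syslog"} |= "link down"')
print(prom_url)
print(loki_url)
```

Grafana issues equivalent requests on behalf of each dashboard panel, so users normally only write the query expression itself.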

Alerting

Alertmanager is a component of the Prometheus ecosystem designed to handle alerts generated by Prometheus. It manages the entire lifecycle of alerts, including deduplication, grouping, silencing, and routing notifications to various communication channels like email, Webex, Slack, and others, ensuring that alerts are delivered to the right people in a timely and organized manner.

In the ACI Monitoring Stack, both Prometheus and Loki are configured with alerting rules and send the resulting alerts to Alertmanager.

    flowchart-elk LR
      L["Loki"]
      P["Prometheus"]
      AM["Alertmanager"]
      N["Notifications (Mail/Webex etc...)"]
      L --> AM
      P --> AM
      AM --> N
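
Deduplication and grouping can be pictured with a small conceptual sketch. It is not Alertmanager's implementation, and the alert names and labels are invented for the example. It drops alerts with an identical label set and buckets the rest by a chosen list of labels, so that each bucket would produce a single notification.

```python
from collections import defaultdict


def dedupe_and_group(alerts, group_by):
    """Drop alerts with identical label sets, then bucket by the group_by labels."""
    seen = set()
    groups = defaultdict(list)
    for alert in alerts:
        # An alert's identity is its full label set.
        fingerprint = tuple(sorted(alert["labels"].items()))
        if fingerprint in seen:
            continue
        seen.add(fingerprint)
        key = tuple(alert["labels"].get(name, "") for name in group_by)
        groups[key].append(alert)
    return dict(groups)


# Hypothetical alerts, as Prometheus and Loki rules might raise them.
alerts = [
    {"labels": {"alertname": "NodeDown", "fabric": "fab1", "node": "101"}},
    {"labels": {"alertname": "NodeDown", "fabric": "fab1", "node": "101"}},
    {"labels": {"alertname": "NodeDown", "fabric": "fab1", "node": "102"}},
    {"labels": {"alertname": "LinkFlap", "fabric": "fab1", "node": "101"}},
]
grouped = dedupe_and_group(alerts, group_by=["alertname", "fabric"])
for key, members in grouped.items():
    print(key, len(members))
```

Here the two distinct `NodeDown` alerts end up in one group and the `LinkFlap` alert in another, which mirrors how grouping keeps related alerts in a single message.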

Demo Environment Access and Use

Stack Deployment Guide

Stack Development Guide