zepor commented 1 month ago
    ○ Time: 1 week
    ○ Tools Required: Azure Monitor, Prometheus, Grafana
    ○ Steps:
        1. Set up real-time monitoring using Azure Monitor.
            □ Configure dashboards and alerts for monitoring data flows.
        2. Integrate Prometheus for collecting metrics.
            □ Set up Prometheus for collecting and storing metrics.
        3. Configure Grafana dashboards for real-time insights.
            □ Develop and configure Grafana dashboards for data quality metrics visualization.
        4. Store monitoring credentials in GitHub Secrets.
            □ Secrets Needed: PROMETHEUS_ENDPOINT, GRAFANA_API_KEY
    ○ Documentation:
        § Monitoring setup and configuration guides.
        § Grafana dashboard configurations and sample visualizations.
    ○ Major Milestone: Data quality monitoring systems set up.
    ○ GitHub Issue:

Set up Monitoring Systems for Data Quality Metrics

Description: Implement monitoring systems for tracking data quality. Tasks:

codeautopilot[bot] commented 1 month ago

Potential solution

To set up a comprehensive monitoring system for data quality metrics using Azure Monitor, Prometheus, and Grafana, we need to follow a structured approach. This involves setting up real-time monitoring with Azure Monitor, integrating Prometheus for metrics collection, configuring Grafana for visualization, and securely storing credentials in GitHub Secrets. The solution will be implemented through a combination of documentation, scripts, and GitHub Actions workflows.

How to implement

Step 1: Documentation for GitHub Secrets

File: backend-container/docs/

Create detailed documentation on how to store PROMETHEUS_ENDPOINT and GRAFANA_API_KEY in GitHub Secrets.

## Storing Credentials in GitHub Secrets

### Overview
This guide will walk you through the process of storing sensitive credentials, such as `PROMETHEUS_ENDPOINT` and `GRAFANA_API_KEY`, in GitHub Secrets. GitHub Secrets allow you to securely store and manage sensitive information needed by your workflows.

### Steps to Store Secrets

1. **Navigate to Your Repository on GitHub:**
   - Open your web browser and go to your GitHub repository.

2. **Access the Settings:**
   - Click on the `Settings` tab located at the top of the repository page.

3. **Go to Secrets:**
   - In the left sidebar, under the `Security` section, click `Secrets and variables`, then `Actions`.

4. **Add a New Secret:**
   - Click on the `New repository secret` button.

5. **Enter the Secret Name and Value:**
   - For the first secret, enter `PROMETHEUS_ENDPOINT` in the `Name` field and the Prometheus endpoint URL in the `Value` field.
   - For the second secret, enter `GRAFANA_API_KEY` in the `Name` field and the Grafana API key in the `Value` field.
   - Each secret is added separately, so repeat steps 4 to 6 once per secret.

6. **Save the Secret:**
   - Click the `Add secret` button to save the secret.

### Example

#### Adding `PROMETHEUS_ENDPOINT`

1. Navigate to your repository on GitHub.
2. Click on `Settings`.
3. In the left sidebar, click on `Secrets and variables`, then `Actions`.
4. Click on `New repository secret`.
5. Enter `PROMETHEUS_ENDPOINT` in the `Name` field.
6. Enter the Prometheus endpoint URL in the `Value` field.
7. Click `Add secret`.

#### Adding `GRAFANA_API_KEY`
1. Navigate to your repository on GitHub.
2. Click on `Settings`.
3. In the left sidebar, click on `Secrets and variables`, then `Actions`.
4. Click on `New repository secret`.
5. Enter `GRAFANA_API_KEY` in the `Name` field.
6. Enter the Grafana API key in the `Value` field.
7. Click `Add secret`.

### Using Secrets in GitHub Actions

Once the secrets are stored, you can use them in your GitHub Actions workflows. Here is an example of how to reference these secrets in a workflow file:

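name: Monitoring Setup

on: [push]

jobs:
  monitoring:  # job id chosen for illustration
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2

      - name: Set up Monitoring Systems
        # Secrets are exposed to the step as environment variables.
        env:
          PROMETHEUS_ENDPOINT: ${{ secrets.PROMETHEUS_ENDPOINT }}
          GRAFANA_API_KEY: ${{ secrets.GRAFANA_API_KEY }}
        run: |
          # Your setup script here
          echo "Setting up monitoring systems..."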


By following these steps, you can securely store and manage your PROMETHEUS_ENDPOINT and GRAFANA_API_KEY in GitHub Secrets, ensuring that sensitive information is protected while being accessible to your workflows.
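Inside a workflow step, the secrets arrive as ordinary environment variables. The following is a minimal sketch of how a setup script might check for them before doing any work. `require_env` is a hypothetical helper written for this example and is not part of any library.

```python
import os


def require_env(names, environ=None):
    """Return {name: value} for the given variables.

    Raises RuntimeError listing every variable that is unset or empty.
    `environ` defaults to os.environ; passing a dict makes testing easier.
    """
    environ = os.environ if environ is None else environ
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )
    return {name: environ[name] for name in names}
```

A script could call `require_env(["PROMETHEUS_ENDPOINT", "GRAFANA_API_KEY"])` as its first step. Reporting all missing names at once avoids fixing them one failed run at a time, and the error message contains only names, never secret values.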

Step 2: GitHub Actions Workflow

File: backend-container/.github/workflows/monitoring_setup.yml

Create a GitHub Actions workflow to automate the setup of Azure Monitor, Prometheus, and Grafana, and store credentials in GitHub Secrets.

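name: Monitoring Setup

on:
  push:
    branches:
      - main

jobs:
  setup-monitoring:
    runs-on: ubuntu-latest

    steps:
    - name: Checkout repository
      uses: actions/checkout@v2

    - name: Set up Azure Monitor
      run: |
        # The installer URL is missing from the source; <azure-cli-install-script-url> is a placeholder.
        curl -sL <azure-cli-install-script-url> | sudo bash
        # Values in angle brackets are placeholders; supply them from a secret such as AZURE_CREDENTIALS.
        az login --service-principal --username <client-id> --password <client-secret> --tenant <tenant-id>
        az monitor metrics alert create --name "DataQualityAlert" --resource-group <resource-group> --scopes <resource-id> --condition "avg Percentage CPU > 90" --window-size 5m --evaluation-frequency 1m --action <action-group>

    - name: Set up Prometheus
      id: prometheus
      run: |
        # Assumes the Prometheus release archive is already present; the download command is missing from the source.
        tar xvfz prometheus-*.tar.gz
        # The trailing slash makes the glob match only the extracted directory, not the archive.
        cd prometheus-*/
        ./prometheus --config.file=prometheus.yml &
        # The ::set-output command is deprecated; step outputs are written to $GITHUB_OUTPUT.
        echo "prometheus_endpoint=http://localhost:9090" >> "$GITHUB_OUTPUT"

    - name: Set up Grafana
      id: grafana
      run: |
        sudo apt-get install -y adduser libfontconfig1
        # Assumes grafana_7.5.2_amd64.deb is already present; the download command is missing from the source.
        sudo dpkg -i grafana_7.5.2_amd64.deb
        sudo systemctl start grafana-server
        sudo systemctl enable grafana-server
        curl -X POST -H "Content-Type: application/json" -d '{"name":"DataQualityDashboard","type":"prometheus","url":"http://localhost:9090","access":"proxy","basicAuth":false}' http://admin:admin@localhost:3000/api/datasources
        GRAFANA_API_KEY=$(curl -s -X POST -H "Content-Type: application/json" -d '{"name":"api_key","role":"Admin"}' http://admin:admin@localhost:3000/api/auth/keys | jq -r '.key')
        # Mask the key so it is not printed in the workflow logs.
        echo "::add-mask::$GRAFANA_API_KEY"
        echo "grafana_api_key=$GRAFANA_API_KEY" >> "$GITHUB_OUTPUT"

    - name: Store credentials in GitHub Secrets
      # Inputs are kept as given in the source; check the action's README for the inputs it actually accepts.
      uses: google/secrets-sync-action@v1
      with:
        secrets: |
          PROMETHEUS_ENDPOINT=${{ steps.prometheus.outputs.prometheus_endpoint }}
          GRAFANA_API_KEY=${{ steps.grafana.outputs.grafana_api_key }}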
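The workflow calls the Grafana HTTP API immediately after starting the service, so the first request can fail if Grafana is still booting. One way to handle this, sketched here under the assumption that Grafana listens on `http://localhost:3000`, is to poll its `/api/health` endpoint until it answers. `wait_until` and `grafana_is_healthy` are hypothetical helper names chosen for this example.

```python
import time
import urllib.error
import urllib.request


def wait_until(check, timeout=60.0, interval=2.0,
               sleep=time.sleep, clock=time.monotonic):
    """Call check() until it returns True or `timeout` seconds have passed.

    Returns True on success and False on timeout. `sleep` and `clock`
    are injectable so the loop can be tested without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if check():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


def grafana_is_healthy(base_url="http://localhost:3000"):
    """Return True if Grafana's /api/health endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/health", timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused or timed out: the service is not up yet.
        return False
```

Calling `wait_until(grafana_is_healthy)` before creating the data source and API key would let the job fail with a clear timeout instead of an opaque connection error.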

Step 3: Monitoring Setup Script

File: backend-container/src/utils/

Implement the setup script for Azure Monitor, Prometheus, and Grafana.

import os
import requests

def setup_azure_monitor():
    # Placeholder: no Azure Monitor configuration is performed here yet.
    print("Setting up Azure Monitor...")

def setup_prometheus():
    # Placeholder: no Prometheus configuration is performed here yet.
    print("Setting up Prometheus...")

def setup_grafana():
    grafana_api_key = os.getenv('GRAFANA_API_KEY')
    prometheus_endpoint = os.getenv('PROMETHEUS_ENDPOINT')

    if not grafana_api_key or not prometheus_endpoint:
        raise ValueError("GRAFANA_API_KEY and PROMETHEUS_ENDPOINT must be set in environment variables")

    headers = {
        'Authorization': f'Bearer {grafana_api_key}',
        'Content-Type': 'application/json'
    }

    data_source_payload = {
        "name": "Prometheus",
        "type": "prometheus",
        "url": prometheus_endpoint,
        "access": "proxy",
        "basicAuth": False
    }

    # Register Prometheus as a Grafana data source.
    response = requests.post('http://localhost:3000/api/datasources', json=data_source_payload, headers=headers)
    if response.status_code == 200:
        print("Grafana data source created successfully")
    else:
        print(f"Failed to create Grafana data source: {response.content}")

    print("Setting up Grafana dashboards...")

def main():
    setup_azure_monitor()
    setup_prometheus()
    setup_grafana()

if __name__ == "__main__":
    main()

Step 4: Script for Storing GitHub Secrets

File: backend-container/src/utils/

Implement the script to store PROMETHEUS_ENDPOINT and GRAFANA_API_KEY in GitHub Secrets.

import base64
import requests
# PyNaCl: GitHub requires secrets to be encrypted with a libsodium sealed box, not RSA.
from nacl import encoding, public

GITHUB_API_URL = "https://api.github.com"
GITHUB_REPO = "your-username/your-repo"
GITHUB_TOKEN = "your-github-token"
PROMETHEUS_ENDPOINT = "your-prometheus-endpoint"
GRAFANA_API_KEY = "your-grafana-api-key"

def get_public_key():
    # Fetch the repository public key used to encrypt secrets.
    url = f"{GITHUB_API_URL}/repos/{GITHUB_REPO}/actions/secrets/public-key"
    headers = {
        "Authorization": f"token {GITHUB_TOKEN}",
        "Accept": "application/vnd.github.v3+json"
    }
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.json()

def encrypt_secret(public_key, secret_value):
    # The repository key is a base64-encoded Curve25519 public key.
    key = public.PublicKey(public_key["key"].encode("utf-8"), encoding.Base64Encoder())
    encrypted = public.SealedBox(key).encrypt(secret_value.encode("utf-8"))
    return base64.b64encode(encrypted).decode("utf-8")

def store_secret(secret_name, encrypted_value, key_id):
    # Create or update the named repository secret.
    url = f"{GITHUB_API_URL}/repos/{GITHUB_REPO}/actions/secrets/{secret_name}"
    headers = {
        "Authorization": f"token {GITHUB_TOKEN}",
        "Accept": "application/vnd.github.v3+json"
    }
    data = {
        "encrypted_value": encrypted_value,
        "key_id": key_id
    }
    response = requests.put(url, headers=headers, json=data)
    response.raise_for_status()

def main():
    public_key = get_public_key()
    key_id = public_key["key_id"]

    encrypted_prometheus_endpoint = encrypt_secret(public_key, PROMETHEUS_ENDPOINT)
    encrypted_grafana_api_key = encrypt_secret(public_key, GRAFANA_API_KEY)

    store_secret("PROMETHEUS_ENDPOINT", encrypted_prometheus_endpoint, key_id)
    store_secret("GRAFANA_API_KEY", encrypted_grafana_api_key, key_id)

if __name__ == "__main__":
    main()

Step 5: Monitoring Setup Documentation

File: backend-container/docs/

Create comprehensive documentation for setting up and configuring Azure Monitor, Prometheus, and Grafana.

# Monitoring Setup and Configuration Guide

## Introduction
This document provides a step-by-step guide to set up and configure Azure Monitor, Prometheus, and Grafana for monitoring data quality metrics.

## Prerequisites
- Azure account with necessary permissions
- Prometheus installed and running
- Grafana installed and running
- GitHub repository with access to GitHub Secrets

## Setting Up Azure Monitor

### Create an Azure Monitor Resource
1. Navigate to the Azure portal.
2. Create a new Azure Monitor resource.
3. Configure the resource to monitor your data flows.

### Configure Dashboards and Alerts
1. Go to the Azure Monitor resource.
2. Set up dashboards to visualize data flows.
3. Configure alerts to notify you of any anomalies or issues.

## Integrating Prometheus

### Install Prometheus
Follow the official Prometheus installation guide for your operating system.

### Configure Prometheus
Edit the `prometheus.yml` configuration file to scrape metrics from your data sources.

Example configuration:
scrape_configs:
  - job_name: 'data_quality_metrics'
    static_configs:
      - targets: ['<your_data_source_endpoint>']
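Each target listed under `targets` has to serve its metrics over HTTP in the Prometheus text exposition format. As a rough sketch of what such an endpoint could look like using only the standard library, the code below renders gauges and serves them at `/metrics`. The metric names, the values in `CURRENT_METRICS`, and port 8000 are assumptions made for this example; a real service would more likely use an official Prometheus client library.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def render_metrics(metrics, help_text=None):
    """Render {name: value} as gauges in the Prometheus text exposition format."""
    help_text = help_text or {}
    lines = []
    for name in sorted(metrics):
        if name in help_text:
            lines.append(f"# HELP {name} {help_text[name]}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {float(metrics[name])}")
    return "\n".join(lines) + "\n"


# Hypothetical metric names and values, for illustration only.
CURRENT_METRICS = {
    "data_quality_completeness_ratio": 0.98,
    "data_quality_error_rate": 0.02,
}


class MetricsHandler(BaseHTTPRequestHandler):
    """Serve the current metrics at /metrics and 404 everywhere else."""

    def do_GET(self):
        if self.path != "/metrics":
            self.send_error(404)
            return
        body = render_metrics(CURRENT_METRICS).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "text/plain; version=0.0.4")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(port=8000):
    """Block forever, serving metrics on the given port."""
    HTTPServer(("", port), MetricsHandler).serve_forever()
```

With `serve()` running, the target in `prometheus.yml` would be `localhost:8000`, since Prometheus scrapes the `/metrics` path by default.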

### Start Prometheus
Run Prometheus using the configured `prometheus.yml` file.

## Setting Up Grafana

### Install Grafana
Follow the official Grafana installation guide for your operating system.

### Configure Grafana Data Source
1. Log in to Grafana.
2. Add Prometheus as a data source:
   - Navigate to Configuration > Data Sources.
   - Add a new data source and select Prometheus.
   - Enter the Prometheus endpoint URL.

### Create Grafana Dashboards
1. Create new dashboards to visualize data quality metrics.
2. Example panels to include:
   - Data flow rates
   - Error rates
   - Data completeness
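The three example panels each need a number behind them. The sketch below shows one plausible way to compute them from a batch of records represented as dictionaries. The function names and the definitions used (for instance, treating `None` and the empty string as missing) are assumptions for illustration, not definitions taken from this project.

```python
def completeness(records, required_fields):
    """Fraction of required field slots that hold a value.

    A slot counts as missing when the key is absent or the value is
    None or an empty string.
    """
    total = len(records) * len(required_fields)
    if total == 0:
        return 1.0
    filled = sum(
        1
        for record in records
        for field in required_fields
        if record.get(field) not in (None, "")
    )
    return filled / total


def error_rate(records, is_valid):
    """Fraction of records rejected by the `is_valid` predicate."""
    if not records:
        return 0.0
    return sum(1 for record in records if not is_valid(record)) / len(records)


def flow_rate(record_count, window_seconds):
    """Records processed per second over a time window."""
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    return record_count / window_seconds
```

For example, four records with two required fields each give eight slots; if five of them are filled, `completeness` returns 0.625.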

## Sample Visualizations

Provide screenshots or JSON configurations of sample dashboards.
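As a starting point for such a JSON configuration, here is a hedged sketch that builds a request body in the shape Grafana's dashboard HTTP API (`POST /api/dashboards/db`) accepts. The panel titles and PromQL expressions are hypothetical, the data source name `Prometheus` is assumed to match the one created by the setup script, and newer Grafana versions may expect the data source as an object rather than a name.

```python
import json


def build_dashboard_payload(title, queries, datasource="Prometheus"):
    """Build a dashboard request body with one time series panel per query.

    `queries` maps a panel title to a PromQL expression. Panels are laid
    out two per row on Grafana's 24-column grid.
    """
    panels = []
    for index, (panel_title, expr) in enumerate(queries.items()):
        panels.append({
            "id": index + 1,
            "title": panel_title,
            "type": "timeseries",
            "datasource": datasource,
            "gridPos": {"h": 8, "w": 12, "x": 12 * (index % 2), "y": 8 * (index // 2)},
            "targets": [{"expr": expr, "refId": "A"}],
        })
    return {
        # id and uid of None ask Grafana to create a new dashboard.
        "dashboard": {"id": None, "uid": None, "title": title, "panels": panels},
        "overwrite": True,
    }


# Hypothetical metric names, for illustration only.
payload = build_dashboard_payload("Data Quality", {
    "Data flow rate": "rate(data_quality_records_total[5m])",
    "Error rate": "data_quality_error_rate",
    "Data completeness": "data_quality_completeness_ratio",
})
dashboard_json = json.dumps(payload, indent=2)
```

The resulting `dashboard_json` string can be committed alongside the documentation as a sample, or sent to Grafana with the API key as a bearer token.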

## Storing Monitoring Credentials in GitHub Secrets

### Add Secrets to GitHub
1. Navigate to your GitHub repository.
2. Go to Settings > Secrets and variables > Actions.
3. Add the following secrets:
   - `PROMETHEUS_ENDPOINT`
   - `GRAFANA_API_KEY`

By following this guide, you will have a robust monitoring system in place to track data quality metrics using Azure Monitor, Prometheus, and Grafana.

By following these steps, you will have a comprehensive setup for monitoring data quality metrics using Azure Monitor, Prometheus, and Grafana, with secure storage of credentials in GitHub Secrets.

