SEL-735 Meter Event Data Pipeline

This repository contains a set of Bash scripts that form a data pipeline for automating interaction with an SEL-735 meter. The pipeline is divided into two main executable scripts:

  1. data_pipeline.sh: Handles the first four steps:

    • Connecting to the meter via FTP
    • Downloading new files
    • Organizing and creating metadata
    • Compressing data
  2. archive_pipeline.sh: Handles the final step:

    • Archiving and transferring event data to the Data Acquisition System (DAS)

Prerequisites

Before running the pipeline, make sure you have SSH access to the camio-ot-dev server, the meter's FTP credentials, and the meter configuration details (see Installation and Configuration below).

Installation

  1. You must be connected to the camio-ot-dev server. See camio-ot-dev (SSH) in the ACEP Wiki.

  2. Clone the repository:

    git clone git@github.com:acep-uaf/camio-meter-streams.git
    cd camio-meter-streams/cli_meter

    Note: You can check your SSH connection with ssh -T git@github.com

Configuration

Data Pipeline Configuration

  1. Navigate to the config directory and copy the config.yml.example file to a new config.yml file:

    cd config
    cp config.yml.example config.yml
  2. Update the config.yml file with the FTP credentials and meter configuration data.

  3. Secure the config.yml file so that only the owner can read and write:

    chmod 600 config.yml
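
    The exact keys come from config.yml.example in the repository; as a rough illustration only, a populated config.yml might look like the following (all field names and values here are hypothetical placeholders, not taken from the example file):

    ```yaml
    # Illustrative sketch only — consult config.yml.example for the real schema
    ftp:
      host: "192.0.2.10"      # meter FTP address (placeholder)
      username: "ftpuser"     # placeholder credential
      password: "change-me"   # placeholder credential
    meter:
      id: "sel-735-01"        # placeholder meter identifier
    ```

    Because this file holds credentials, the chmod 600 step above is important: it prevents other users on the server from reading them.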

Archive Pipeline Configuration

  1. Navigate to the config directory and copy the archive_config.yml.example file to a new archive_config.yml file:

    cd config
    cp archive_config.yml.example archive_config.yml
  2. Update the archive_config.yml file with the source and destination directories and details.

  3. Secure the archive_config.yml file so that only the owner can read and write:

    chmod 600 archive_config.yml
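
    As with the data pipeline config, the authoritative schema is archive_config.yml.example; a hypothetical populated file might look like this (key names and values are illustrative placeholders, except enable_cleanup and the retention setting, which the Execution section below refers to):

    ```yaml
    # Illustrative sketch only — consult archive_config.yml.example for the real schema
    source_dir: "/data/level0"                     # placeholder source path
    destination: "user@das-host:/archive/level0"   # placeholder DAS target
    enable_cleanup: true                           # triggers cleanup.sh after a successful archive
    retention_days: 30                             # placeholder retention period
    ```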

Execution

To run the data pipeline and then transfer data to the Data Acquisition System (DAS):

  1. Run the Data Pipeline First

    Execute the data_pipeline script from the cli_meter directory. The script requires a configuration file specified via the -c/--config flag. If this is your first time running the pipeline, the initial download may take a few hours. To pause the download safely, see: How to Stop the Pipeline

    Command

    ./data_pipeline.sh -c /path/to/config.yml
  2. Run the Archive Pipeline

    After the data_pipeline script completes, execute the archive_pipeline script from the cli_meter directory. The script requires a configuration file specified via the -c/--config flag.

    Command

    ./archive_pipeline.sh -c /path/to/archive_config.yml

    Notes

    The rsync command uses the --exclude flag to skip the working directory, ensuring that only complete files are transferred.
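
    The effect of the exclusion can be demonstrated with a self-contained sketch (the paths and the working/ directory name here are placeholders; the pipeline's actual rsync invocation lives in archive_pipeline.sh):

    ```shell
    #!/usr/bin/env bash
    # Demonstrates rsync --exclude on throwaway directories.
    set -euo pipefail

    src=/tmp/camio_rsync_demo/src
    dst=/tmp/camio_rsync_demo/dst
    rm -rf /tmp/camio_rsync_demo
    mkdir -p "$src/working" "$dst"

    echo "complete event"   > "$src/event_001.txt"
    echo "partial download" > "$src/working/event_002.txt"

    # --exclude keeps in-progress files out of the destination
    rsync -a --exclude 'working/' "$src/" "$dst/"

    ls "$dst"   # only event_001.txt should appear
    ```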

  3. Run the Cleanup Process (Conditional)

    If the archive_pipeline script completes successfully and the enable_cleanup flag is set to true in the archive configuration file, the cleanup.sh script will be executed automatically. This script removes outdated event files from level0 based on the retention period specified in the configuration file.

    Notes

    Ensure that the retention period for each directory included in the cleanup process is set correctly in the archive_config.yml file, since cleanup.sh reads its settings from there.
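
    cleanup.sh itself is not reproduced here, but age-based deletion of this kind is commonly built on find -mtime. The following sketch shows the general technique (the path and the 30-day retention value are placeholders, not taken from cleanup.sh):

    ```shell
    #!/usr/bin/env bash
    # Illustration of retention-based cleanup; not the repository's cleanup.sh.
    set -euo pipefail

    dir=/tmp/camio_cleanup_demo
    rm -rf "$dir"
    mkdir -p "$dir"

    touch -d "40 days ago" "$dir/old_event.txt"   # past the retention window
    touch "$dir/new_event.txt"                    # recent, should survive

    retention_days=30
    # Delete files whose modification time is older than the retention period
    find "$dir" -type f -mtime "+$retention_days" -delete

    ls "$dir"   # only new_event.txt should remain
    ```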

How to Stop the Pipeline

When you need to stop the pipeline:

Testing

This repository includes automated tests for the scripts using Bats (Bash Automated Testing System). The tests are located in the test directory.

Bats Documentation

Prerequisites

Ensure you have bats-core installed. You can install it using the following steps:

  1. On Ubuntu/Debian:

    sudo apt-get update
    sudo apt-get install -y bats
  2. On macOS (using Homebrew):

    brew install bats-core

Running the Tests

  1. Navigate to the project directory:

    cd /path/to/camio-meter-streams/cli_meter
  2. Run all the tests:

    bats test/
  3. Run specific test files:

    bats test/test_data_pipeline.bats
    bats test/test_commons.bats
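
A minimal Bats test file has the following shape (this example is illustrative only and is not copied from the repository's test directory; the assertion about data_pipeline.sh's behavior is an assumption based on the required -c/--config flag):

```bats
#!/usr/bin/env bats
# Illustrative Bats test — shows the @test/run/status pattern.

@test "data_pipeline.sh fails without a config flag" {
  run ./data_pipeline.sh
  # The script requires -c/--config, so a bare invocation should exit nonzero
  [ "$status" -ne 0 ]
}
```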