afaisma opened this issue 12 months ago
terraform {
  required_version = ">= 0.14"

  required_providers {
    atlas = {
      source = "jpmchase.net/terraform/atlas-aws"
    }
  }
}
module "jpm_data" { source = "tfe.jpmchase.net/ATLAS-MODULE-REGISTRY/jpm-data/aws" version = "8.0.0" }
provider "aws" { access_key = var.aws_access_key_id secret_key = var.aws_secret_access_key token = var.aws_session_token region = var.aws_region sts_region = "us-east-1" assume_role { role_arn = "arn:aws:iam::${var.aws_account_id}:role/tfe-module-pave-apply" } }
provider "atlas" {}
module "kms" { source = "tfe.jpmchase.net/ATLAS-MODULE-REGISTRY/kms/aws" version = "7.0.0" kms_key_alias = "speechplace-kms" kms_key_description = "KMS for speechplace" }
variable "s3_bucket_name" { default = "speechplace-s3-bucket" }
module "speechplace-s3-bucket" { source = "tfe.jpmchase.net/ATLAS-MODULE-REGISTRY/s3/aws" version = "7.27.0" kms_key_arn = module.kms.kms_key_arn core_backups_retention = "NOBACKUP" tags = { Name = var.s3_bucket_name } }
module "role_speechplace" { source = "tfe.jpmchase.net/ATLAS-MODULE-REGISTRY/role/aws" version = "5.5.0" saml_enabled = false assumable_service = ["transcribe.amazonaws.com", "s3.amazonaws.com"] name = "speechplace_role" }
module "role" { source = "tfe.jpmchase.net/ATLAS-MODULE-REGISTRY/role/aws" version = "5.3.0" name = "sample-role" assumable_service = ["ec2.amazonaws.com"] }
module "role_policy_updater" { source = "tfe.jpmchase.net/ATLAS-MODULE-REGISTRY/role-policy-updater/aws" version = "39.2.0" role_name = module.role.role_name create_managed_policy = true s3_access = { list_access = true read_access = true write_access = true delete_access = true delete_object_version_access = true bucket_names = [var.s3_bucket_name] }
}
terraform {
  required_version = ">= 0.14"

  required_providers {
    aws = {
      source  = "hashicorp/aws"
      version = "~> 3.0" # Specify the version as per your requirements
    }
  }
}
provider "aws" { access_key = var.aws_access_key_id secret_key = var.aws_secret_access_key token = var.aws_session_token region = var.aws_region
assume_role { role_arn = "arn:aws:iam::${var.aws_account_id}:role/tfe-module-pave-apply" } }
import csv
from datetime import datetime
from collections import defaultdict
# Week ordering used for sorting; this snippet treats Sunday as day 0
days_of_week = {
    'Sunday': 0,
    'Monday': 1,
    'Tuesday': 2,
    'Wednesday': 3,
    'Thursday': 4,
    'Friday': 5,
    'Saturday': 6
}
def parse_time(time_str):
    # Accept both 'HH:MM' and bare-hour ('H') time strings
    try:
        return datetime.strptime(time_str, '%H:%M')
    except ValueError:
        return datetime.strptime(time_str, '%H')
with open('your_csv_file.csv', 'r') as file:
    csv_reader = csv.DictReader(file)
    data = list(csv_reader)
sorted_data = sorted(data, key=lambda x: (x['namespace'], days_of_week[x['DOW']], parse_time(x['Time'])))
sorted_map = defaultdict(list)
for record in sorted_data:
    namespace = record['namespace']
    sorted_map[namespace].append(record)
for namespace, records in sorted_map.items():
    sorted_map[namespace] = sorted(records, key=lambda x: parse_time(x['Time']))
for namespace, records in sorted_map.items():
    print(f'Namespace: {namespace}')
    for record in records:
        print(f'DOW: {record["DOW"]}, Time: {record["Time"]}, scale_to: {record["scale_to"]}')
    print()
def print_list_of_dicts(list_of_dicts):
    for d in list_of_dicts:
        print("{")
        for key, value in d.items():
            print(f"  {key}: {value}")
        print("}")
list_of_dicts = [
    {"name": "John", "age": 30},
    {"name": "Alice", "age": 25},
    {"name": "Bob", "age": 35},
]
print_list_of_dicts(list_of_dicts)
list_of_dicts = [
    {"name": "John", "age": 30},
    {"name": "Alice", "age": 25},
    {"name": "Bob", "age": 35},
]
max_name_width = max(len(d["name"]) for d in list_of_dicts)
max_age_width = max(len(str(d["age"])) for d in list_of_dicts)
print(f"{'Name':<{max_name_width}} {'Age':<{max_age_width}}")
for d in list_of_dicts:
    print(f"{d['name']:<{max_name_width}} {d['age']:<{max_age_width}}")
from datetime import datetime
def find_current(namespace, deployment, current_time):
    # Relies on 'data' and 'parse_time' from the CSV-loading snippet above
    def time_difference(dt1, dt2):
        return abs((dt1 - dt2).total_seconds())

    # Convert the current_time to a datetime object
    current_time_dt = datetime.strptime(current_time, '%H:%M')

    # Initialize variables to store the closest record and its time difference
    closest_record = None
    min_time_difference = float('inf')

    # Iterate through the data to find the closest record
    for record in data:
        if record['namespace'] == namespace and record['DOW'] == deployment:
            record_time = parse_time(record['Time'])
            diff = time_difference(current_time_dt, record_time)
            # Update closest_record if this record has a smaller time difference
            if diff < min_time_difference:
                min_time_difference = diff
                closest_record = record

    return closest_record
namespace = 'p11-rttr-1'
deployment = 'Tuesday'
current_time = '8:30'
result = find_current(namespace, deployment, current_time)
if result:
    print(f'Closest Record: {result}')
else:
    print('No matching record found.')
from datetime import datetime
def find_current(namespace, deployment, current_time):
    # Relies on 'data' from the CSV-loading snippet above
    def time_difference(dt1, dt2):
        return abs((dt1 - dt2).total_seconds())

    # Convert the current_time ('<Day> H:M') to a datetime object
    current_time_dt = datetime.strptime(current_time, '%A %H:%M')

    # Initialize variables to store the closest record and its time difference
    closest_record = None
    min_time_difference = float('inf')

    # Iterate through the data to find the closest record
    for record in data:
        if record['namespace'] == namespace and record['DOW'] == deployment:
            record_datetime = datetime.strptime(f"{record['DOW']} {record['Time']}", '%A %H:%M')
            diff = time_difference(current_time_dt, record_datetime)
            # Update closest_record if this record has a smaller time difference
            if diff < min_time_difference:
                min_time_difference = diff
                closest_record = record

    return closest_record
namespace = 'p11-rttr-1'
deployment = 'Tuesday'
current_time = 'Tuesday 8:30'
result = find_current(namespace, deployment, current_time)
if result:
    print(f'Closest Record: {result}')
else:
    print('No matching record found.')
from datetime import datetime, timedelta
def get_nsec_from_beginning_of_the_week(day_of_week, time_str):
    days_of_week = {
        'Monday': 0,
        'Tuesday': 1,
        'Wednesday': 2,
        'Thursday': 3,
        'Friday': 4,
        'Saturday': 5,
        'Sunday': 6
    }

    # Get the numerical representation of the specified day of the week
    dow_num = days_of_week.get(day_of_week)
    if dow_num is None:
        return None

    # Calculate the number of seconds from the beginning of the week
    # (Monday 00:00) to the given day and time
    current_time = datetime.strptime(time_str, '%H:%M')
    offset = timedelta(days=dow_num, hours=current_time.hour, minutes=current_time.minute)
    return offset.total_seconds()
day_of_week = 'Tuesday'
time_str = '18:30'
seconds_from_beginning = get_nsec_from_beginning_of_the_week(day_of_week, time_str)
if seconds_from_beginning is not None:
    print(f'Seconds from the beginning of the week: {seconds_from_beginning}')
else:
    print('Invalid day of the week.')
Housekeeping Service Architecture Document
Table of Contents
1. Introduction
2. System Architecture
3. Scaling Instructions
4. Implementation Details
5. Operations
6. Conclusion
1. Introduction

The Housekeeping Service is designed to efficiently manage the scaling of small housekeeping pods deployed across various deployments and namespaces in a distributed environment. These pods receive scaling instructions from a Parameter Store, which contains CSV files with scaling directives. The service's primary goal is to ensure the desired number of pods per region, cluster, namespace, and deployment at specific times and days of the week.
2. System Architecture

The architecture of the Housekeeping Service consists of the following key components:
2.1 Housekeeping Pods
Small pods deployed across multiple deployments and namespaces, responsible for monitoring and adjusting the number of pods in response to scaling instructions.

2.2 Parameter Store
Centralized repository for storing scaling instructions. Contains CSV files with scaling directives, organized by cluster, region, namespace, deployment, day of the week, time, and scale_to fields.

2.3 Scaling Controller
Core component responsible for managing scaling operations. Retrieves scaling instructions from the Parameter Store, analyzes them to determine the most relevant scaling directive for each pod, and initiates scaling up or down accordingly (a minimal sketch follows below).

2.4 Operations Interface
Provides a user-friendly interface for performing operations on the Parameter Store and scaling directives.
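A minimal sketch of one Scaling Controller pass, assuming the directives live as a CSV payload in AWS SSM Parameter Store and the pods run on Kubernetes; the parameter name, the CSV column names, and the in-cluster configuration are illustrative assumptions, not the service's actual interface:

import csv
import io

import boto3
from kubernetes import client, config

def fetch_directives(parameter_name):
    # Read the CSV payload stored in the given SSM parameter (hypothetical name)
    ssm = boto3.client('ssm')
    value = ssm.get_parameter(Name=parameter_name)['Parameter']['Value']
    return list(csv.DictReader(io.StringIO(value)))

def scale_deployment(namespace, deployment, replicas):
    # Patch the deployment's replica count through the Kubernetes API
    config.load_incluster_config()  # assumes the controller runs in-cluster
    apps = client.AppsV1Api()
    apps.patch_namespaced_deployment_scale(
        deployment, namespace, {'spec': {'replicas': replicas}})

# One pass; selecting only the directive currently in effect is sketched
# under Implementation Details below
for d in fetch_directives('/housekeeping/scaling-directives'):
    scale_deployment(d['namespace'], d['deployment'], int(d['scale_to']))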
3. Scaling Instructions

Scaling instructions are stored in CSV format within the Parameter Store. Each instruction includes the following fields (an illustrative example follows the list):
Cluster: Identifies the cluster to which the instruction applies.
Region: Specifies the region where scaling should occur.
Namespace: Defines the namespace within which the scaling applies.
Deployment: Identifies the specific deployment targeted for scaling.
Day of the Week: Specifies the day of the week when the scaling should take place.
Time: The time of day when scaling should occur.
Scale_to: The desired number of pods per region, cluster, namespace, and deployment.
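For illustration, a directive file with these fields might look like the rows below (every value is invented) and can be read with csv.DictReader as in the snippets above:

import csv
import io

# Hypothetical directive rows; all values are made up for illustration
sample_csv = """cluster,region,namespace,deployment,DOW,Time,scale_to
cluster-a,us-east-1,p11-rttr-1,worker,Tuesday,08:30,5
cluster-a,us-east-1,p11-rttr-1,worker,Tuesday,18:00,2
"""

for row in csv.DictReader(io.StringIO(sample_csv)):
    print(row['namespace'], row['DOW'], row['Time'], row['scale_to'])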
4. Implementation Details

The Housekeeping Service is implemented using cloud-native technologies and follows best practices for scalability and reliability. Key implementation details include:
Utilizing cloud services for the Parameter Store and scaling operations.
Containerization of housekeeping pods for easy deployment and scaling.
Integration with time-scheduling mechanisms to trigger scaling actions at specific times (see the sketch after this list).
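Building on the get_nsec_from_beginning_of_the_week helper above, one way for the scheduler to decide which directive is currently in effect is to compare seconds-from-week-start values. A sketch, assuming a non-empty, already-parsed directives list with the DOW and Time fields used throughout this issue:

from datetime import datetime

def current_directive(directives, now=None):
    # Choose the latest directive at or before 'now' within the week,
    # wrapping around to the final directive of the previous week
    now = now or datetime.now()
    now_sec = get_nsec_from_beginning_of_the_week(now.strftime('%A'), now.strftime('%H:%M'))
    timed = sorted(
        ((get_nsec_from_beginning_of_the_week(d['DOW'], d['Time']), d) for d in directives),
        key=lambda t: t[0],
    )
    active = [d for sec, d in timed if sec <= now_sec]
    return active[-1] if active else timed[-1][1]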
5. Operations

The Housekeeping Service provides the following operations for managing scaling instructions and parameters:
5.1 CSV to Parameter Store Parameters
Allows users to upload CSV files containing scaling instructions to the Parameter Store.
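A minimal sketch of this operation with boto3, assuming the whole CSV file is stored as one SSM string parameter; the file and parameter names are illustrative:

import boto3

def upload_csv_to_parameter_store(csv_path, parameter_name):
    # Store the raw CSV contents as a single SSM string parameter
    ssm = boto3.client('ssm')
    with open(csv_path) as f:
        ssm.put_parameter(
            Name=parameter_name,
            Value=f.read(),
            Type='String',
            Overwrite=True,
        )

upload_csv_to_parameter_store('scaling_directives.csv', '/housekeeping/scaling-directives')

Standard SSM parameters are capped at 4 KB, so a large directive set may need the Advanced parameter tier or one parameter per cluster.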
# Provider configuration
provider "aws" {
  region = "us-west-2" # Specify your AWS region
}
# Create an S3 bucket for use with AWS Transcribe services
resource "aws_s3_bucket" "transcribe_bucket" {
  bucket = "my-transcribe-bucket" # Replace with a unique bucket name
  acl    = "private"              # Set the ACL as per your requirement
}
# Create an IAM role that can be assumed by AWS Transcribe services
resource "aws_iam_role" "transcribe_role" {
  name = "transcribe_role" # Name of the IAM role

  # Trust relationship policy for AWS Transcribe and Transcribe Streaming
  assume_role_policy = jsonencode({
    Version = "2012-10-17",
    Statement = [
      {
        Action = "sts:AssumeRole",
        Effect = "Allow",
        Principal = {
          Service = ["transcribe.amazonaws.com", "transcribestreaming.amazonaws.com"]
        }
      }
    ]
  })
}
# Policy document for full access to the created S3 bucket
data "aws_iam_policy_document" "s3_full_access" {
  statement {
    actions = ["s3:*"] # Grant all S3 actions
    resources = [
      aws_s3_bucket.transcribe_bucket.arn,       # The bucket itself
      "${aws_s3_bucket.transcribe_bucket.arn}/*" # All objects in the bucket
    ]
  }
}
# Create the IAM policy from the above document
resource "aws_iam_policy" "s3_full_access_policy" {
  name   = "s3_full_access_policy"
  policy = data.aws_iam_policy_document.s3_full_access.json
}
# Attach the S3 full access policy to the IAM role
resource "aws_iam_role_policy_attachment" "s3_attach" {
  role       = aws_iam_role.transcribe_role.name
  policy_arn = aws_iam_policy.s3_full_access_policy.arn
}