CCI-MOC / openshift-usage-scripts

0 stars 3 forks source link

Blueprint on how to organize this codebase #66

Closed naved001 closed 4 days ago

naved001 commented 3 weeks ago

This has grown organically as the requirements grew. The goal is to come up with the blueprint on how we want to organize and refactor this codebase.

naved001 commented 2 weeks ago

@knikolla This is what I have in my head so far:

First I want to take care of the low hanging fruits by moving some pieces of code into their own modules/class.

# prometheus_client.py
class PrometheusClient:
    def __init__(self, prometheus_url: str, token: str, step_min: int=15):
        self.proemtheus_url = prometheus_url
        self.token = token
        self.step_min = step_min

    def query_metric(self, metric, start_date, end_date):
        """Queries metric from prometheus/thanos for the provided openshift_url"""
        ...
        return data
#s3_client.py
class S3Client:
    def __init__(self):
        #init

    def upload_file(self, file_path: str, s3_path: str) -> None:
        self.s3.upload_file(file_path, Bucket=self.bucket_name, Key=s3_path)

    def download_file(self, file_path: str, s3_path: str) -> None:
        self.s3.download_file(file_path, Bucket=self.bucket_name, Key=s3_path)
 class ColdFrontClient(object):

    def __init__(self, keycloak_url, keycloak_client_id, keycloak_client_secret):
        self.session = self.get_session(keycloak_url,
                                        keycloak_client_id,
                                        keycloak_client_secret)

    @staticmethod
    def get_session(keycloak_url, keycloak_client_id, keycloak_client_secret):
        """Authenticate as a client with Keycloak to receive an access token."""
        token_url = f"{keycloak_url}/auth/realms/mss/protocol/openid-connect/token"
        ...
        return session

Then I want to have a class that's responsible for processing the metrics; this will need to be aware of the times we don't bill for because the duration a pod ran is calculated here.

#metrics_processor.py
class MetricsProcessor(object):
    def __init__(self, merged_metrics: dict = {}, interval_minutes: int = 15):
        self.merged_metrics = merged_metrics
        self.interval_minutes = interval_minutes

    def merge_metrics(self, metric_name, metric_list):
        """
        Merge metrics by pod but since pod names aren't guaranteed to be unique across
        namespaces, we organize by namespace: self.merged_metrics[namespace][pod]
        """
        return self.merged_metrics

    def condense_metrics(self, metrics_to_check):
        """
        Checks if the value of metrics is the same, and removes redundant
        metrics while updating the duration. If there's a gap in the reported
        metrics then don't count that as part of duration.

        This is where we will need to ignore certain hours we don't bill for. That mechanism doesn't exist as of now.
        """
        return condensed_dict

The data I get from MetricsProcessor is now organized and each pod has the duration it ran for. Although I still need to calculate the SU type for these pods before an invoice can be generated. I am thinking about doing it in MetricsProcessor.

After that I could reuse good chunk of the code here: https://github.com/CCI-MOC/openstack-billing-from-db/blob/main/src/openstack_billing_db/billing.py

I would also have to add something similar to generate a pod report. And ultimately, a main.py that can run two ways: 1. to collect metrics and 2. to generate reports.

knikolla commented 1 week ago

@naved001 That seems pretty good for a start!

One comment: Don't put mutable objects in default arguments :)

>>> 
>>> class MetricsProcessor(object):
...    def __init__(self, merged_metrics: dict = {}, interval_minutes: int = 15):
...        self.merged_metrics = merged_metrics
...        self.interval_minutes = interval_minutes
... 
>>> a = MetricsProcessor()
>>> b = MetricsProcessor()
>>>
>>> a.merged_metrics is b.merged_metrics
True
>>> a.merged_metrics.update({1: 1})
>>> b.merged_metrics
{1: 1}
naved001 commented 1 week ago

Don't put mutable objects in default arguments :)

Thanks for pointing it out, that would be terrible.

joachimweyl commented 1 week ago

@naved001 this looks great, what are the next steps?

naved001 commented 1 week ago

@joachimweyl I will start the refactoring incrementally with multiple small PRs.

joachimweyl commented 1 week ago

This issue sounds like it is only interested in the creation of the blueprints. shall we close this and open individual issues for the incremental steps?

naved001 commented 4 days ago

@joachimweyl correct. I will do that.