ComplianceAsCode / auditree-framework

The Auditree framework tool to run compliance control checks as unit tests.
https://auditree.github.io/
Apache License 2.0

Add support to parameterized fetchers #145

Closed cletomartin closed 1 year ago

cletomartin commented 1 year ago

Overview

There are many cases where a user needs to pull data from a service in which they have multiple accounts, for example listing the nodes across multiple accounts of a service provider. This is the typical use case, and it would be great if one could write a single fetcher as a template method and have it executed multiple times, each time with different parameters.

Requirements

Approach

I think we can use the parameterized library for this. For example:

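import json

from compliance.config import get_config
from compliance.evidence import DAY, RawEvidence
from compliance.fetch import ComplianceFetcher
from parameterized import parameterized


def _get_orgs():
    # Organisations to fetch, read from the 'org.gh.orgs' configuration key.
    return get_config().get('org.gh.orgs')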

class GHMembers(ComplianceFetcher):
    @classmethod
    def setUpClass(cls):
        cls.evidence = {
            org: RawEvidence(
                f'{org}_members.json',
                'gh',
                DAY,
                f'GH members of org {org}'
            )
            for org in _get_orgs()
        }
        cls.config.add_evidences(list(cls.evidence.values()))
        headers = {'Accept': 'application/json'}
        cls.session('https://api.github.com/orgs', **headers)
        return cls

    @parameterized.expand(_get_orgs)
    def fetch_github_members(self, org):
        """Fetch the GH members."""
        evidence = self.evidence[org]
        if self.locker.validate(evidence):
            return

        resp = self.session().get(f'/{org}/members', params={'page': 1})
        resp.raise_for_status()
        evidence.set_content(json.dumps(resp.json(), indent=2))
        self.locker.add_evidence(evidence)

This generates an output like this:

INFO: Using locker found in /tmp/compliance...

Fetcher Primary Run

fetch_github_members_0_IBM (demo_examples.fetchers.fetch_gh_orgs.GHMembers)
"Fetch the GH members [with org='IBM']. ... ok
fetch_github_members_0_IBM (demo_examples.fetchers.fetch_gh_orgs.GHMembers)
"Fetch the GH members [with org='IBM']. - ran in: 0.001s
fetch_github_members_1_EnterpriseDB (demo_examples.fetchers.fetch_gh_orgs.GHMembers)
"Fetch the GH members [with org='EnterpriseDB']. ... ok
fetch_github_members_1_EnterpriseDB (demo_examples.fetchers.fetch_gh_orgs.GHMembers)
"Fetch the GH members [with org='EnterpriseDB']. - ran in: 0.000s

It is still to be decided whether the parameterized approach can be improved or is good enough as it is. I have been thinking about a @store_param_raw_evidence decorator, but it gets complicated quickly at that point.
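For readers wondering where names such as fetch_github_members_0_IBM come from, here is a minimal standard-library sketch of the general idea: one template method is replaced by one generated method per parameter value. This is not how the parameterized library is implemented, and expand_fetchers, DemoFetcher and fetch_members are hypothetical names used only for illustration. No real API is called; the fetcher just records the organisation it was given.

```python
import re
import unittest


def expand_fetchers(method_name, values):
    """Class decorator (hypothetical): replace `method_name` with one method per value."""
    def decorate(cls):
        template = getattr(cls, method_name)
        delattr(cls, method_name)  # the template itself should not run
        for index, value in enumerate(values):
            safe = re.sub(r"\W", "_", str(value))

            # Bind the current value through a default argument.
            def method(self, _value=value):
                return template(self, _value)

            method.__name__ = f"{method_name}_{index}_{safe}"
            method.__doc__ = template.__doc__
            setattr(cls, method.__name__, method)
        return cls
    return decorate


@expand_fetchers("fetch_members", ["IBM", "EnterpriseDB"])
class DemoFetcher(unittest.TestCase):
    fetched = []

    def fetch_members(self, org):
        """Record the org instead of calling a real API."""
        self.fetched.append(org)


# Collect the generated methods and run them like any other test methods.
names = sorted(n for n in dir(DemoFetcher) if n.startswith("fetch_members_"))
suite = unittest.TestSuite(DemoFetcher(n) for n in names)
result = unittest.TestResult()
suite.run(result)
```

Each generated method carries the parameter in its name, which is why the run output lists one entry per organisation.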

Test Plan

TBD

smithsz commented 1 year ago

Nice idea. We have some custom decorators that call fetchers/checks repeatedly with different configurations. Having a generic way to achieve this in the documentation would be great 👍

cletomartin commented 1 year ago

I made #146 as an example. It fetches some members of two GitHub organisations. I have realised that the example would be much better if we could use @parameterized_class, so that we could even generate a report per organisation.

I believe it is possible to provide support for it, but we would need to accept wildcards like demo.CheckClass* in controls.json because parameterized modifies the name of the class at runtime.
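A rough sketch of what such wildcard support could look like, using only the standard library's fnmatch module. The helper expand_check_patterns and the generated class names below are assumptions for illustration; they are not part of the framework, and the actual names produced at runtime may differ.

```python
from fnmatch import fnmatchcase


def expand_check_patterns(patterns, available_checks):
    """Return the available check paths matching any pattern, keeping their order."""
    matched = []
    for check in available_checks:
        if any(fnmatchcase(check, pattern) for pattern in patterns):
            if check not in matched:
                matched.append(check)
    return matched


# Hypothetical class paths, as they might look after runtime renaming.
available = [
    "demo.CheckClass_0_IBM",
    "demo.CheckClass_1_EnterpriseDB",
    "demo.OtherCheck",
]

# A wildcard entry from controls.json selects every generated class.
selected = expand_check_patterns(["demo.CheckClass*"], available)

# An entry without a wildcard still matches exactly one class.
exact = expand_check_patterns(["demo.OtherCheck"], available)
```

With matching of this kind, existing controls.json entries without wildcards would keep their current meaning.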