doitintl / cloud-catalog

Extract categories and services (as unified JSON) for major public cloud services.

Initial stab at Focus Areas processing #86

Closed: anorth848 closed this 1 month ago

anorth848 commented 3 months ago

This PR contains the current process for updating the mappings from services and categories to Focus Areas, as well as the scripts that load the data into BQ for reporting:

├── README.md                               # Updated with instructions 
├── data
│   ├── focus_areas
│   │   ├── CategoryToFocusArea.tsv         # category tags to Focus Areas (this is updated when categories need to be added/removed/updated)
│   │   ├── FocusAreas.tsv                  # List of defined focus areas
│   │   ├── ProductToFocusArea.tsv          # Product names to Focus Areas (this is updated when products need to be added/removed/updated)
│   │   ├── all.json                        # Generated by focus_areas/build_focus_areas.py after the TSVs have been updated
│   │   └── exceptions                      # Lists Products that have not been assigned to a Focus Area,
│   │                                       # and invalid Product names found in ProductToFocusArea.tsv
│   │       ├── aws.json
│   │       ├── google_cloud.json
│   │       ├── google_workspace.json
│   │       ├── microsoft_azure.json
│   │       └── microsoft_office_365.json
├── focus_areas
│   ├── build_focus_areas.py                # Builds data/focus_areas/all.json
│   ├── deploy_to_bq.py                     # Deploys Focus Area data to BQ for reporting
│   ├── generate_hcl.py                     # Helper script to generate HCL for zenrouter-infra/datastore.tf
│   ├── requirements.txt
│   └── sql_statements.py                   # Used by build_focus_areas.py; contains SparkSQL

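
For reviewers who want a feel for the flow above (TSV mappings in, a combined mapping plus an exceptions report out), here is a minimal sketch in plain Python. It is not the code in `build_focus_areas.py`, which uses SparkSQL. The column names `product` and `focus_area`, the function names, the shape of the output, and the sample product list are all assumptions made for illustration.

```python
import csv
import io
import json


def read_tsv(text):
    """Parse tab-separated text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def build_focus_areas(focus_areas, product_rows, known_products):
    """Group products by focus area and collect exceptions.

    Hypothetical helper: the real column names and output schema may differ.
    """
    mapping = {name: [] for name in focus_areas}
    invalid = []
    assigned = set()
    for row in product_rows:
        product, area = row["product"], row["focus_area"]
        # A row is invalid if the product or the focus area is unknown.
        if product not in known_products or area not in mapping:
            invalid.append(product)
            continue
        mapping[area].append(product)
        assigned.add(product)
    # Known products that no valid row mentioned are reported as unassigned.
    unassigned = sorted(set(known_products) - assigned)
    return mapping, {"unassigned": unassigned, "invalid": sorted(invalid)}


# Made-up sample data standing in for FocusAreas.tsv and ProductToFocusArea.tsv.
focus_tsv = "focus_area\nData\nSecurity\n"
product_tsv = (
    "product\tfocus_area\n"
    "BigQuery\tData\n"
    "Cloud KMS\tSecurity\n"
    "NoSuchProduct\tData\n"
)
known = ["BigQuery", "Cloud KMS", "Cloud Run"]

areas = [row["focus_area"] for row in read_tsv(focus_tsv)]
mapping, exceptions = build_focus_areas(areas, read_tsv(product_tsv), known)
print(json.dumps({"mapping": mapping, "exceptions": exceptions}, indent=2))
```

In this sketch the two exception kinds mirror the description of the `exceptions` directory: `unassigned` holds known products with no Focus Area, and `invalid` holds product names in the TSV that do not match a known product.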
anorth848 commented 2 months ago

This is more or less ready for review. Some of the scripts/files may disappear in the future, but for now this is what we are using for go-live.

anorth848 commented 1 month ago

@alexei-led @caddac Now that Focus Areas are deployed, we should definitely get this merged. We will create a backlog task to revamp this process in the future, but at least it is documented, and in the interim we can follow it to update Focus Areas if needed.