doitintl / cloud-catalog

Extract categories and services (as unified JSON) for major public cloud services.

Public Cloud Services

Unfortunately, none of the major cloud vendors provides a friendly API for listing all public cloud services and categories as they appear on the AWS Products, GCP Products, and Azure Services pages.

The idea is to have a unified JSON schema for all cloud services.

{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "type": "array",
  "items": [
    {
      "type": "object",
      "properties": {
        "id": {
          "type": "string"
        },
        "name": {
          "type": "string"
        },
        "summary": {
          "type": "string"
        },
        "url": {
          "type": "string"
        },
        "categories": {
          "type": "array",
          "items": [
            {
              "type": "object",
              "properties": {
                "id": {
                  "type": "string"
                },
                "name": {
                  "type": "string"
                }
              },
              "required": [
                "id",
                "name"
              ]
            }
          ]
        },
        "tags": {
          "type": "array",
          "items": [
            {
              "type": "string"
            }
          ]
        }
      },
      "required": [
        "id",
        "name",
        "summary",
        "url",
        "categories",
        "tags"
      ]
    }
  ]
}
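
As a rough illustration of what the schema asks for, the sketch below checks records by hand using only the Python standard library. It is not part of the repository; the function names `check_service` and `check_catalog` are made up for this example. Note that in JSON Schema draft-04 an array-valued `items` validates elements by position, so a strict validator would check only the first element against the schema as written. The sketch instead applies the same rules to every record, which is presumably the intent.

```python
# Required keys taken from the schema above.
SERVICE_KEYS = ("id", "name", "summary", "url", "categories", "tags")
CATEGORY_KEYS = ("id", "name")
STRING_KEYS = ("id", "name", "summary", "url")


def check_service(record):
    """Return a list of problems found in one service record (empty if none)."""
    if not isinstance(record, dict):
        return ["record is not an object"]
    problems = [f"missing key: {key}" for key in SERVICE_KEYS if key not in record]
    for key in STRING_KEYS:
        if key in record and not isinstance(record[key], str):
            problems.append(f"{key} is not a string")
    categories = record.get("categories", [])
    if not isinstance(categories, list):
        problems.append("categories is not an array")
        categories = []
    for index, category in enumerate(categories):
        if not isinstance(category, dict):
            problems.append(f"categories[{index}] is not an object")
            continue
        for key in CATEGORY_KEYS:
            if not isinstance(category.get(key), str):
                problems.append(f"categories[{index}].{key} is missing or not a string")
    tags = record.get("tags", [])
    if not isinstance(tags, list) or not all(isinstance(tag, str) for tag in tags):
        problems.append("tags is not an array of strings")
    return problems


def check_catalog(records):
    """Map record index to its problems, keeping only records that have any."""
    return {i: p for i, p in enumerate(map(check_service, records)) if p}
```

To try it against a generated file, load the file with `json.load` and pass the resulting list to `check_catalog`; an empty dictionary means every record has the required shape.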

Scraping AWS Cloud Services

The AWS Products page uses the undocumented https://aws.amazon.com/api/dirs/items/search endpoint to fetch paged JSON records for the available cloud products.

# download AWS service JSON file and generate data/aws.json
pip install -r requirements.txt
python discovery/aws.py > data/aws.json
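
The general shape of reading a paged endpoint can be sketched as follows. This is not the code in discovery/aws.py: the helper `iter_paged` and the `fetch_page` callable are hypothetical, and because the endpoint is undocumented, its query parameters and response layout are deliberately left out. The sketch assumes that pages are numbered from zero and that an empty page marks the end.

```python
from typing import Callable, Dict, Iterator, List


def iter_paged(fetch_page: Callable[[int], List[Dict]], max_pages: int = 1000) -> Iterator[Dict]:
    """Yield records page by page until an empty page is returned.

    fetch_page is a hypothetical callable that takes a zero-based page
    number and returns that page's records as a list of dictionaries.
    max_pages guards against an endpoint that never returns an empty page.
    """
    for page in range(max_pages):
        items = fetch_page(page)
        if not items:
            return
        yield from items


# Stand-in for the HTTP call: three fake records served two per page.
FAKE_RECORDS = [{"id": "a"}, {"id": "b"}, {"id": "c"}]


def fake_fetch(page, size=2):
    start = page * size
    return FAKE_RECORDS[start:start + size]


records = list(iter_paged(fake_fetch))
```

In real use, `fetch_page` would wrap an HTTP GET against the search endpoint and return the list of records from the decoded JSON response. Keeping the paging loop separate from the HTTP call makes it easy to test without network access.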

Scraping GCP Cloud Services

The GCP Products page is rendered on the server side and all data is embedded into the web page.

# scrape the GCP Products page to get all services and generate data/gcp.json
pip install -r requirements.txt
python discovery/gcp.py > data/gcp.json
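
Because the data is embedded in the HTML, extracting it comes down to parsing the page. The sketch below shows one way to pull link targets and their text out of a page with the standard library `html.parser` module. It is an illustration only: the class `LinkCollector`, the function `extract_links`, and the sample markup in the usage note are hypothetical, and discovery/gcp.py may well use a different library and different selectors.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect (href, text) pairs for every <a> element that has an href."""

    def __init__(self):
        super().__init__()
        self.links = []
        self._href = None  # href of the <a> currently open, if any
        self._text = []    # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def extract_links(html):
    """Parse an HTML string and return the collected (href, text) pairs."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.links
```

For example, calling `extract_links('<a href="/products/compute">Compute Engine</a>')` returns a single pair holding the link target and the visible service name. A real scraper would still need to filter the links down to product entries and map them onto the unified schema.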

Scraping Azure Cloud Services

The Azure Services page is rendered on the server side and all data is embedded into the web page.

# scrape the Azure Services page to get all services and generate data/azure.json
pip install -r requirements.txt
python discovery/azure.py > data/azure.json

Microsoft365 Services

Edit the ms365.json file. Use data from this page.

Scraping Google Workspace Services (GSuite)

This page lists all Google Workspace services.

# scrape the Google Workspace page to get all services and generate data/gsuite.json
pip install -r requirements.txt
python discovery/gsuite.py > data/gsuite.json

CMP Services

Edit the cmp.json file. Use the CMP UI and documentation.

Credits

Edit the credits.json file.

Update/merge all tags

Run the tags.sh script to regenerate the tags.json file, which contains all platform, category, and service tags from all services.
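
Conceptually, the merge amounts to collecting the `tags` arrays from every record in every generated file. The Python sketch below illustrates that idea only; it is not a port of tags.sh, whose contents are not shown here. The function name `merge_tags` is hypothetical, and the output format (a flat, sorted list of unique strings) is an assumption about tags.json.

```python
def merge_tags(catalogs):
    """Return the sorted, de-duplicated tags found across all catalogs.

    catalogs is an iterable of record lists, one list per generated file.
    Records without a "tags" key are skipped rather than treated as errors.
    """
    merged = set()
    for records in catalogs:
        for record in records:
            merged.update(record.get("tags", []))
    return sorted(merged)
```

To apply it to the generated files, load each data/*.json file with `json.load` and pass the resulting lists to `merge_tags`.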

Public static location

Upload all generated JSON files to the public cloud_tags Cloud Storage bucket.

Focus Areas update process

Focus Areas support specific services and categories based on this repo.
To update the service/category mappings for Focus Areas, follow the process below, then update the zenrouter-infra repo with the output.

Adding support of a Product to a Focus Area

Edit the ProductToFocusArea mapping file.

Adding support of a Category to a Focus Area

Edit the CategoryToFocusArea mapping file.

Build the Focus Area mapping JSON file

Deploying to BigQuery

Once the changes are merged into master, deploy them to BigQuery:

# Deploying requires valid Application Default Credentials (ADC) or a configured service account
gcloud auth application-default login
git checkout master
git pull
mkdir -p build
python3 -m venv build/venv
source build/venv/bin/activate
cd focus_areas/
python -m pip install -r requirements.txt
python ./deploy_to_bq.py --build --deploy --project doit-zendesk-analysis

Deploying the changes in ZenRouter

Once the changes are merged into master, deploy them to ZenRouter Infra.