Experimental programmatic access to the AWS Customer Carbon Footprint Tool data

You can use the AWS Customer Carbon Footprint Tool (CCFT) to view estimates of the carbon emissions associated with your AWS products and services. This experimental script gives you access to the same AWS Customer Carbon Footprint Tool information by mimicking the behavior of the console.

The script provides programmatic access to the same AWS Customer Carbon Footprint Tool data the browser has access to. It enables customers to do two things:

  1. Retrieve individual estimates for hundreds or thousands of accounts programmatically, without logging in to each account manually.
  2. Report carbon at the lower kilogram-level threshold (three decimal digits) introduced with the CSV file download feature.

This repository gives you supporting source code for two use cases:

  1. If you are looking for a way to extract CCFT data for a small number of accounts on an ad-hoc basis, or want to include the script within your application, you can find the ccft_access.py script itself in the MultiAccountApplication/lambda_functions/extract_carbon_emissions/ folder. To get started, check out the General FAQs and the single-account specific FAQs below.

  2. If you are looking for a way to automate the monthly extraction of new CCFT data within a multi account structure, this repository contains source code and supporting files for a serverless application that you can deploy with the SAM CLI or via the Serverless Application Repository. With it, you can deploy an application to extract new AWS Customer Carbon Footprint Tool data every month for all accounts of your AWS organization with the experimental script. You can find the supporting source code within the folder MultiAccountApplication. To get started, check out the General FAQs and the multi-account specific FAQs below.

Read the AWS Customer Carbon Footprint Tool documentation for more details to understand your carbon emission estimations.

Security

See CONTRIBUTING for more information.

License

This library is licensed under the MIT-0 License. See the LICENSE file.

FAQ

General FAQ

Q: What does experimental mean?

This script mimics how the AWS Billing Console accesses CCFT data. It does not use an official AWS interface, so it might change at any time without notice and simply stop working.

Q: How does the data relate to what I see in the AWS Customer Carbon Footprint Tool?

At a high level, the output of the experimental programmatic access script looks like the following. The numbering corresponds to the screenshot of the AWS Customer Carbon Footprint Tool below, which shows where to find each piece of information in the console.

{
  "accountId": "████████████",
  "query": {
    "queryDate": <-- date when the query was executed
    "startDate": <-- START_DATE of the query
    "endDate": <-- END_DATE of the query
  },
  "emissions": {
    "carbonEmissionEntries": [
      {
        "mbmCarbon": <-- (1), Your estimated carbon emissions in metric tons of CO2eq, following the market-based method (mbm) of the Greenhouse Gas Protocol
        "paceProductCode": <-- (2), Your emissions by service
        "regionCode": <-- (3), Your emissions by geography
        "startDate": <-- month this data relates to
      },
      {
        […]
      }
    ],
    "carbonEmissionsForecast": [ <-- (5), Path to 100% renewable energy
      {
        […]
        "mbmCarbon": <-- Your estimated, forecasted carbon emissions in metric tons of CO2eq, following the market-based method (mbm) of the Greenhouse Gas Protocol
        "startDate": <-- year this data relates to
      },
      {
        […]
      }
    ],
    "carbonEmissionsInefficiency": [ <-- (4)
      {
        "gridMixInefficiency": <-- (4.1), Your emission savings from AWS renewable energy purchases
        […]
        "serverMedianInefficiency": <-- (4.2), Your emission savings from using AWS computing services
        "startDate": <-- month this data relates to
      },
      {
        […]
      }
    ]
    […]
  }
}

(Screenshot: AWS Customer Carbon Footprint Tool console with the numbered references)

If your AWS Customer Carbon Footprint Tool emissions are zero, the script also returns 0.0. Note that in this case you will not see the product or region split (paceProductCode and regionCode under carbonEmissionEntries are not returned).
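Because these split fields can be absent, a consumer of the script's output should not assume they exist. A minimal sketch of such a consumer, assuming the output shape above; the "N/A" fallback value is our own choice, not part of the script's output:

```python
import json

def normalize_entries(payload):
    """Flatten carbonEmissionEntries, tolerating missing split fields.

    When total emissions are zero, the script omits paceProductCode and
    regionCode, so we substitute a placeholder value."""
    rows = []
    for entry in payload.get("emissions", {}).get("carbonEmissionEntries", []):
        rows.append({
            "accountId": payload.get("accountId"),
            "startDate": entry.get("startDate"),
            "mbmCarbon": float(entry.get("mbmCarbon", "0.0")),
            "paceProductCode": entry.get("paceProductCode", "N/A"),
            "regionCode": entry.get("regionCode", "N/A"),
        })
    return rows

if __name__ == "__main__":
    # In practice the payload would come from running ccft_access.py
    sample = json.loads('''{"accountId": "123456789012",
        "emissions": {"carbonEmissionEntries":
            [{"mbmCarbon": "0.0", "startDate": "2023-01-01"}]}}''')
    print(normalize_entries(sample))
```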

Read the AWS Customer Carbon Footprint Tool documentation for more details to understand your carbon emission estimations.

Single-account script FAQ

Q: How do I use the script?

  1. Clone the repository and navigate to the folder MultiAccountApplication/lambda_functions/extract_carbon_emissions/.
  2. Assume a role with access to the AWS Customer Carbon Footprint Tool.
  3. Execute the script:
python ccft_access.py
{
    "accountId": "████████████",
    "query": {
        "queryDate": "2023-02-12", "startDate": "2020-01-01", "endDate": "2023-01-01"
    },
    "emissions": {
        "carbonEmissionEntries": [
            {
                "mbmCarbon": "0.048", "paceProductCode": "Other", "regionCode": "EMEA", "startDate": "2020-01-01"
            },
[…]
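Step 2 (assuming a role) can also be scripted with boto3. A hedged sketch; the role name ccft-read-role is just an illustrative example, and the STS call requires valid AWS credentials:

```python
import os

def build_role_arn(account_id, role_name="ccft-read-role"):
    """Build the ARN of the role to assume; the role name is illustrative."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"

def export_credentials(role_arn):
    """Assume the role and expose its temporary credentials via environment
    variables, so a subsequent `python ccft_access.py` run picks them up."""
    import boto3  # imported lazily; the call needs valid AWS credentials
    creds = boto3.client("sts").assume_role(
        RoleArn=role_arn, RoleSessionName="ccft-extract"
    )["Credentials"]
    os.environ["AWS_ACCESS_KEY_ID"] = creds["AccessKeyId"]
    os.environ["AWS_SECRET_ACCESS_KEY"] = creds["SecretAccessKey"]
    os.environ["AWS_SESSION_TOKEN"] = creds["SessionToken"]
```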

Q: What AWS IAM role do I need?

Use a role with the following AWS IAM policy that contains the AWS Customer Carbon Footprint Tool IAM permission:

{   
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sustainability:GetCarbonFootprintSummary",
            "Resource": "*"
        }
    ]
}

Q: What python packages do I need?

You will need the Python requests and boto3 packages. You can install them like this:

python -m pip install requests boto3

Q: For what timeframe is data extracted?

New carbon emissions data is available monthly, with a delay of three months as AWS gathers and processes the data that's required to provide your carbon emissions estimates. By default, the script extracts data starting from 39 months ago until three months before the current month.

Example: When you are running the script in July 2023, the script extracts carbon emissions data from April 2020 to April 2023. (start_date: 2020-04-01, end_date: 2023-04-01)
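The default window can be reproduced with a few lines of standard-library date arithmetic. This is a sketch of the logic described above, not the script's actual implementation:

```python
from datetime import date

def months_back(day, n):
    """Return the first day of the month n months before the given date."""
    total = day.year * 12 + (day.month - 1) - n
    return date(total // 12, total % 12 + 1, 1)

def default_window(today):
    """Default query window: from 39 months ago to 3 months before now."""
    return months_back(today, 39), months_back(today, 3)

# Running in July 2023 yields the window from the example above:
# default_window(date(2023, 7, 15)) == (date(2020, 4, 1), date(2023, 4, 1))
```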

Q: How can I change the queried timeframe?

Execute python ccft_access.py -h for help on changing the default interval.

python ccft_access.py -h
usage: ccft_access.py [-h] [--start-date START_DATE] [--end-date END_DATE]

Experimental retrieval of AWS Customer Carbon Footprint Tool console data. The data
is queried for a closed interval from START_DATE to END_DATE (YYYY-MM-DD). The queried timeframe
must be less than 36 months and not before 2020-01-01.

optional arguments:
  -h, --help            show this help message and exit
  --start-date START_DATE, -s START_DATE
                        first month of the closed interval, default: 36 months before end month
  --end-date END_DATE, -e END_DATE
                        last month of the closed interval, default: 3 months before current month

Q: How can I get the output pretty-printed?

You can use jq, a lightweight and flexible command-line JSON processor, to pretty-print the JSON output. If you use pip, you can install it with pip install jq.

python ccft_access.py | jq .
{
  "accountId": "████████████",
  "query": {
    "queryDate": "2023-02-12",
    "startDate": "2020-01-01",
    "endDate": "2023-01-01"
  },
  "emissions": {
    "carbonEmissionEntries": [
      {
        "mbmCarbon": "0.048",
        "paceProductCode": "Other",
        "regionCode": "EMEA",
        "startDate": "2020-01-01"
      },
[…]

Q: How do I get the data as a CSV?

You can extend the use of jq in the previous question to transform the JSON output to a CSV file.

python ccft_access.py | \
    jq -r '{accountId} as $account |
        .emissions.carbonEmissionEntries |
        map(. + $account ) |
        (map(keys) | add | unique) as $cols |
        map(. as $row | $cols | map($row[.])) as $rows |
        $cols, $rows[] | @csv' > ccft-data.csv

head ccft-data.csv
"accountId","mbmCarbon","paceProductCode","regionCode","startDate"
"████████████","0.048","Other","EMEA","2020-01-01"
[…]

Multi-Account extraction FAQ

Q: What does the application do on a high level?

(Architecture diagram)

The application does the following on a high level:

Q: What resources am I deploying?

This SAM template deploys the following resources:

You can find details on the resources that are created within the template.yaml file.

Q: What does the state machine do?

(State machine diagram)

Q: How can I deploy the application?

You can deploy the application via the Serverless Application Repository or with the SAM CLI.

Option 1: Deployment via the AWS Serverless Application Repository

The AWS Serverless Application Repository is a managed repository for serverless applications. Using the Serverless Application Repository (SAR), you don't need to clone, build, package, or publish source code to AWS before deploying it. To deploy the application, go to the Experimental Programmatic Access application.


In the AWS Management console, you can view the application's permissions and resources, and configure the application in the Application settings section.

Option 2: Deployment with the SAM CLI

The Serverless Application Model Command Line Interface (SAM CLI) is an extension of the AWS CLI that adds functionality for building and testing Lambda applications.

To use the SAM CLI, you need the following tools.

(1) Clone the repository.

(2) Navigate to the cloned repository and the folder MultiAccountApplication, so that you are in the folder that includes the template.yaml file. To build and deploy your application for the first time, run the following commands:

sam build
sam deploy --guided --profile <specify your profile name as in your credentials file or use default>

The first command will build the source of your application. If you get an error message related to Python 3.11 runtime dependencies, you can also use sam build --use-container to build your serverless application's code in a Docker container.

The second command will package and deploy your application to AWS, with a series of prompts:

Confirm changeset to be deployed and wait until all resources are deployed. You will see a success message such as Successfully created/updated stack - ccft-sam-script in eu-central-1.

(3) Within the AWS console, navigate to the CloudFormation dashboard. Here, you will see the stack that was just deployed. You can navigate to the Resources tab to see all resources that were created as part of this stack.

(4) The state machine will automatically be triggered on the next 15th of the month. If you want to run the application right away, navigate to your ExtractCarbonEmissionsStateMachine Step Functions state machine, select Start execution, leave everything as is, and select Start execution again.

Q: Into which account should I deploy this application?

You can run this from any account within your organization, as long as you set up the necessary permissions.

Q: What permissions are needed to run this?

In order to successfully extract carbon emissions data from the central account for all child accounts, follow these steps:

(1) Deploy an IAM role named ccft-read-role with the following AWS IAM policy, which contains the AWS Customer Carbon Footprint Tool IAM permission, into all child accounts. There are several options for doing this across all accounts of your AWS organization; they are explained in Q: How can I deploy IAM roles into multiple AWS accounts?

{   
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "sustainability:GetCarbonFootprintSummary",
            "Resource": "*"
        }
    ]
}

(2) Additionally, you need to set up a trust relationship so that the Lambda function extract-carbon-emissions-data in the central account (where you've deployed the SAM application) can assume this role. Update {Account} with the account ID of the account where you've deployed this application.

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Principal": {
                "AWS": "arn:aws:iam::{Account}:role/extract-emissions-lambda-role"
            },
            "Action": "sts:AssumeRole",
            "Condition": {}
        }
    ]
}

(3) Optional: if you have given the IAM role a different name, you can change the parameter CCFTRoleName when deploying the SAM application. Make sure that all roles within all child accounts have the same name.

(4) The get-account-ids.py Lambda function calls the AWS Organization's ListAccounts API. This operation can be called only from the organization's management account or by a member account that is a delegated administrator for an AWS service. You can set up a delegation policy which allows the account where you're deploying this solution to call the ListAccounts API.
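A sketch of how such a Lambda function might collect the account IDs with boto3; this is an assumption about the implementation, not the actual get-account-ids.py source:

```python
def active_ids(accounts):
    """Keep only ACTIVE accounts; ListAccounts also returns SUSPENDED ones."""
    return [a["Id"] for a in accounts if a["Status"] == "ACTIVE"]

def list_active_account_ids():
    import boto3  # lazy import keeps active_ids testable without AWS access
    org = boto3.client("organizations")
    accounts = []
    # ListAccounts is paginated, so walk every page
    for page in org.get_paginator("list_accounts").paginate():
        accounts.extend(page["Accounts"])
    return active_ids(accounts)
```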

Q: How can I deploy IAM roles into multiple AWS accounts?

Depending on your AWS Organization set-up, there are several ways to deploy the same IAM role with a trust relationship into all accounts of your AWS organization.

(1) In an AWS Control Tower environment, use the Customization for Control Tower (CfCT) CI/CD pipeline to deploy the CCFT read-only IAM role for existing and new accounts. A sample manifest and CFn template are included in the ccft-org-read-role folder in this repository.

(2) Using AWS CloudFormation StackSets to deploy IAM roles to existing and future accounts. Check out the blog post Use AWS CloudFormation StackSets for Multiple Accounts in an AWS Organization for more details. A sample template (ccft-read-only.yaml) for the role is included in the ccft-org-read-role/ccft-role folder in this repository.

Q: Can I change the queried timeframe?

The application extracts data for the month three months before the month in which it runs. Example: when the application runs in July 2023, it extracts data for April 2023.

You can override the timeframe when you manually start the Step Functions workflow.
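Starting the workflow programmatically might look like the following sketch. The input key names start_date and end_date are hypothetical assumptions; check the state machine definition in template.yaml for the actual input contract:

```python
import json

def build_execution_input(start_date, end_date):
    """Input document for a manual run; the key names are hypothetical."""
    return json.dumps({"start_date": start_date, "end_date": end_date})

def start_extraction(state_machine_arn, execution_input):
    import boto3  # requires AWS credentials with states:StartExecution
    boto3.client("stepfunctions").start_execution(
        stateMachineArn=state_machine_arn, input=execution_input
    )
```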

Q: What can I do with the data?

As a result of a successful run through the state machine, new emissions data from the AWS Customer Carbon Footprint Tool will be available monthly in the S3 bucket {AccountId}-{Region}-ccft-data in .json format.

To make it easier to use this data, the state machine also creates an Athena table and two views to directly query the data with SQL. Navigate to the Amazon Athena console and select the database carbon_emissions. You should see a table named carbon_emissions_table and two views called carbon_emissions_view and carbon_emissions_aggregate_view.

carbon_emissions_view shows the monthly carbon emissions per account ID, product code and region code. carbon_emissions_aggregate_view shows the aggregated carbon emissions per account ID per month, and the respective entries for gridmixinefficiency and servermedianinefficiency.

(Athena console screenshot)

If you select Preview Table using the three dots next to your table, you can see the nested JSON data, with one row per JSON file. You can also view the statement used to create the table by selecting Generate table DDL using the three dots next to your table.

Next, select Preview view using the three dots next to your view. When you select Show/edit query, you can see and modify the query that creates the view. The view contains the unnested data, with one row per account and month. You can use SQL statements to query the data directly. For example, to find all emissions for a specific month:

SELECT * FROM carbon_emissions_view WHERE startdate = '2022-01-01';

If you want to visualize the data, you can do so by using Amazon QuickSight. Check out the following documentation entry to understand how you can create a QuickSight dataset using Amazon Athena data.

Other things to consider

Troubleshooting

What are costs for running this application?

The cost of running this application depends on the number of accounts in your organization. You can use this AWS Pricing Calculator example and adapt it to your requirements. There are no upfront costs to running this application; you pay only for the resources that are created and used as part of the application. The major services used are the following:

Q: How can I specify the account IDs to retrieve data from?

By default, the application retrieves data from all accounts of the AWS Organization, with the payer account as the canary account. If you want to override the account IDs, you can add them to the scheduling event in the template:

ComplexScheduleEvent:
  Type: ScheduleV2
  Properties:
    ScheduleExpression: "cron(0 0 15 * ? *)"
    Input: "{\"override_accounts\": [\"YOUR-ACCOUNT-ID\", \"YOUR-OTHER-ID\"]}" # add this line
    FlexibleTimeWindow:
      Mode: FLEXIBLE
      MaximumWindowInMinutes: 60

The first account in the list will be used as the canary account.

Q: Cleanup - How can I delete the application?

To delete the sample application that you created, you can use the AWS CLI. Make sure to empty the buckets that were created as part of this application before you run the following command. Assuming you used ccft-sam-script for the stack name, you can run the following:

aws cloudformation delete-stack --stack-name ccft-sam-script
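Emptying the data bucket beforehand can also be scripted with boto3. A sketch; the ccft_bucket_name helper simply mirrors the {AccountId}-{Region}-ccft-data naming scheme mentioned above:

```python
def ccft_bucket_name(account_id, region):
    """Mirror the {AccountId}-{Region}-ccft-data naming scheme used above."""
    return f"{account_id}-{region}-ccft-data"

def empty_bucket(bucket_name):
    """Delete all objects (and versions, if any) so that deleting the stack
    can remove the bucket."""
    import boto3  # requires AWS credentials with delete access to the bucket
    bucket = boto3.resource("s3").Bucket(bucket_name)
    bucket.object_versions.delete()  # harmless if versioning was never enabled
    bucket.objects.all().delete()
```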