The Guidance for Sustainability Data Fabric on AWS is an opinionated sustainability lens built on top of the Guidance for Data Fabric on AWS.
These deployment instructions are intended for use on macOS. Deployment using a different operating system may require additional steps.
Attach the AmazonDataZoneFullUserAccess managed policy to the IAM role.

Clone the Sustainability Insights Framework (SIF) repository, build it, and deploy the SIF platform:

```bash
git clone git@github.com:aws-solutions-library-samples/guidance-for-aws-sustainability-insights-framework.git
cd guidance-for-aws-sustainability-insights-framework
rush update --bypass-policy
rush build
cd infrastructure/platform
npm run cdk -- deploy -c environment=dev -c clusterDeletionProtection=false -c includeCaml=true --all --require-approval never --concurrency=5
```
Next, deploy a SIF tenant:

```bash
cd ../tenant
npm run cdk -- deploy -c tenantId=<tenantId> -c environment=<SIF_ENVIRONMENT> -c administratorEmail=<ADMIN_EMAIL> \
  -c enableDeleteResource=true -c deleteBucket=true -c includeCaml=true \
  -c idcEmail=<idcEmail> -c idcUserId=<idcUserId> -c dfSustainabilityRoleArn=<dfSustainabilityRoleArn> \
  -c dataFabricRegion=<dataFabricRegion> -c dataFabricEventBusArn=<dataFabricEventBusArn> \
  --all --require-approval never --concurrency=10
```
Generate an authentication token for the SIF administrator, replacing 'temporary password' with the temporary password you received and 'new password' with a password of your choosing:

```bash
cd ../../typescript/packages/integrationTests
npm run generate:token -- <tenantId> <SIF_ENVIRONMENT> <ADMIN_EMAIL> 'temporary password' 'new password'
```
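The command prints a token for the SIF administrator. As a minimal sketch of how such a token is typically used (the <PIPELINES_API_URL> placeholder and the Bearer header format are assumptions, not taken from this guidance):

```bash
# Sketch only: <PIPELINES_API_URL> stands in for the pipelines API endpoint
# created by the tenant deployment; the header format is an assumption.
export SIF_TOKEN='<token printed by generate:token>'
curl -H "Authorization: Bearer $SIF_TOKEN" \
     -H "Accept: application/json" \
     "<PIPELINES_API_URL>/pipelines"
```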
The hub and spoke accounts must be bootstrapped for the AWS Cloud Development Kit (CDK), and the spoke account must be bootstrapped to trust the hub account:
```bash
cdk bootstrap <HUB_ACCOUNT_ID>/<REGION> --profile <HUB_PROFILE>
cdk bootstrap <SPOKE_ACCOUNT_ID>/<REGION> --trust <HUB_ACCOUNT_ID> --cloudformation-execution-policies=arn:aws:iam::aws:policy/AdministratorAccess --profile <SPOKE_PROFILE>
```
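For illustration, with placeholder values (111111111111 as the hub account, 222222222222 as the spoke account, us-east-1 as the region, and hypothetical profile names), the bootstrap commands would look like this:

```bash
# Bootstrap the hub account.
cdk bootstrap aws://111111111111/us-east-1 --profile hub-admin

# Bootstrap the spoke account so that deployments from the hub account are trusted.
cdk bootstrap aws://222222222222/us-east-1 \
  --trust 111111111111 \
  --cloudformation-execution-policies arn:aws:iam::aws:policy/AdministratorAccess \
  --profile spoke-admin
```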
Clone and build the Sustainability Data Fabric (SDF) repository:

```bash
git clone git@github.com:aws-solutions-library-samples/guidance-for-sustainability-data-fabric-on-aws.git
cd guidance-for-sustainability-data-fabric-on-aws
rush update --bypass-policy
rush build
cd infrastructure
```
From the infrastructure folder, run the command below to seed the data fabric products and the SIF pipeline resources:
```bash
npm run cdk -- deploy -c hubAccountId=<HUB_ACCOUNT_ID> \
  -c spokeAccountId=<SPOKE_ACCOUNT_ID> -c domainId=<DATAZONE_DOMAIN_ID> \
  -c domainName=<DATAZONE_DOMAIN_NAME> -c projectId=<DATAZONE_PROJECT_ID> \
  -c athenaEnvironmentId=<DATAZONE_DATA_LAKE_ENV_ID> -c redshiftEnvironmentId=<DATAZONE_DATA_WAREHOUSE_ENV_ID> \
  -c roleArn=<SERVICE_ROLE_ARN> -c environment=<SIF_ENV> -c tenantId=<SIF_TENANT_ID> \
  -c sifAdminEmailAddress=<SIF_ADMIN_EMAIL> -c sdfAdminUserId=<IAM_IDENTITY_CENTER_USERNAME> \
  --require-approval never --concurrency=10 SdfHubDemoStack SdfHubProductStack
```
Once the deployment finishes, verify that the following CloudFormation stacks have been created.

In the hub account:

- sdf-common-hub
- sdf-products-hub

In the spoke account:

- sdf-demo-spoke
- sdf-common-spoke
- sdf-products-spoke
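If you prefer the CLI to the console, the stack status can be checked with a short script (a sketch; the profile placeholders match the bootstrap step above):

```bash
# Sketch: each stack should report CREATE_COMPLETE (or UPDATE_COMPLETE on a redeploy).
for stack in sdf-common-hub sdf-products-hub; do
  aws cloudformation describe-stacks --stack-name "$stack" \
    --query 'Stacks[0].StackStatus' --output text --profile <HUB_PROFILE>
done
for stack in sdf-demo-spoke sdf-common-spoke sdf-products-spoke; do
  aws cloudformation describe-stacks --stack-name "$stack" \
    --query 'Stacks[0].StackStatus' --output text --profile <SPOKE_PROFILE>
done
```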
Once the sdf-demo-spoke stack is deployed, navigate to the df-data-asset State Machine in AWS Step Functions in the hub account. You should see about 30 successful executions. Then navigate to the df-spoke-data-asset State Machine in the spoke account. You should also see about 30 successful executions.
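The execution counts can also be checked from the CLI. A sketch for the df-data-asset state machine (run the same commands against df-spoke-data-asset in the spoke account):

```bash
# Sketch: resolve the state machine ARN by name, then count SUCCEEDED executions.
SM_ARN=$(aws stepfunctions list-state-machines \
  --query "stateMachines[?name=='df-data-asset'].stateMachineArn" --output text)
aws stepfunctions list-executions \
  --state-machine-arn "$SM_ARN" \
  --status-filter SUCCEEDED \
  --query 'length(executions)'
```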
In the Amazon DataZone portal, type df-purchased_goods_and_services_transformed in the search bar and press Enter. You should see the df-purchased_goods_and_services_transformed-<generated_id>-recipejoboutput asset. Click on the SUBSCRIBE button.
Run the command below to seed the SIF impact factor resources and start the execution of the ghg:scope_3:purchased_goods_and_services pipeline:
```bash
npm run cdk -- deploy -c hubAccountId=<HUB_ACCOUNT_ID> \
  -c spokeAccountId=<SPOKE_ACCOUNT_ID> -c domainId=<DATAZONE_DOMAIN_ID> \
  -c domainName=<DATAZONE_DOMAIN_NAME> -c projectId=<DATAZONE_PROJECT_ID> \
  -c athenaEnvironmentId=<DATAZONE_DATA_LAKE_ENV_ID> -c redshiftEnvironmentId=<DATAZONE_DATA_WAREHOUSE_ENV_ID> \
  -c roleArn=<SERVICE_ROLE_ARN> -c environment=<SIF_ENV> -c tenantId=<SIF_TENANT_ID> \
  -c sifAdminEmailAddress=<SIF_ADMIN_EMAIL> -c sdfAdminUserId=<IAM_IDENTITY_CENTER_USERNAME> \
  --require-approval never --concurrency=10 SdfHubWorkflowStack
```
Once the deployment finishes, verify that the sdf-workflow-spoke stack has been successfully created in the spoke account. Navigate to the sdf-demo-seeder State Machine in AWS Step Functions. You should see 1 successful execution; it creates all the impact factors and triggers the ghg:scope_3:purchased_goods_and_services pipeline. Next, navigate to the sif-demo-dev-activityPipelineSM State Machine in AWS Step Functions. You should see 1 successful execution; at the end of the pipeline execution, it triggers the registration of the pipeline and metric outputs in DataZone. In the DataZone portal, verify that the pipeline (df-ghg:scope_3:purchased_goods_and_services-<generated_id>-pipeline_<pipeline_id>) and metric (df-sustainability_insight_framework_metrics-<generated_id>-metrics) outputs have been published as assets.
To query the subscribed data, navigate to the data-lake environment and click Query data, which opens the Amazon Athena console. Select df-spoke-<aws_account_id>-<aws_region> in the Database drop-down box; you should see the 2 tables that you just subscribed to.
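The same check can be scripted with the Athena CLI. A sketch, assuming the primary workgroup (adjust to your environment):

```bash
# Sketch: list the tables in the subscribed database via Athena.
QUERY_ID=$(aws athena start-query-execution \
  --query-string 'SHOW TABLES' \
  --query-execution-context Database=df-spoke-<aws_account_id>-<aws_region> \
  --work-group primary \
  --query 'QueryExecutionId' --output text)
# Queries run asynchronously; give it a moment before fetching results.
sleep 5
aws athena get-query-results --query-execution-id "$QUERY_ID"
```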
To view the data lineage information, navigate to the df-hub-datalineage CloudFormation stack and click on the Outputs tab.

Everything should run automatically after deployment completes. You can now navigate to the DataZone UI and explore the assets that have been created. If you do not see all assets in the DataZone catalog, try rerunning the deployment steps.
To remove the guidance, delete all the CloudFormation stacks prefixed with sdf in the hub and spoke accounts.
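A sketch of the teardown from the CLI, using the stack names listed earlier (confirm the full set of sdf-prefixed stacks in your accounts before deleting, and delete dependent stacks before the common ones):

```bash
# Sketch: delete the spoke stacks first, waiting for each delete to finish.
for stack in sdf-workflow-spoke sdf-demo-spoke sdf-products-spoke sdf-common-spoke; do
  aws cloudformation delete-stack --stack-name "$stack" --profile <SPOKE_PROFILE>
  aws cloudformation wait stack-delete-complete --stack-name "$stack" --profile <SPOKE_PROFILE>
done
# Then delete the hub stacks.
for stack in sdf-products-hub sdf-common-hub; do
  aws cloudformation delete-stack --stack-name "$stack" --profile <HUB_PROFILE>
  aws cloudformation wait stack-delete-complete --stack-name "$stack" --profile <HUB_PROFILE>
done
```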
Customers are responsible for making their own independent assessment of the information in this Guidance. This Guidance: (a) is for informational purposes only, (b) represents AWS current product offerings and practices, which are subject to change without notice, and (c) does not create any commitments or assurances from AWS and its affiliates, suppliers or licensors. AWS products or services are provided “as is” without warranties, representations, or conditions of any kind, whether express or implied. AWS responsibilities and liabilities to its customers are controlled by AWS agreements, and this Guidance is not part of, nor does it modify, any agreement between AWS and its customers.