GoogleCloudPlatform / pubsec-declarative-toolkit

The GCP PubSec Declarative Toolkit is a collection of declarative solutions to help you on your Journey to Google Cloud. Solutions are designed using Config Connector and deployed using Config Controller.
Apache License 2.0

FinOps: Automated BigQuery billing export example with nested/composite/normalized records #292

Open fmichaelobrien opened 1 year ago

fmichaelobrien commented 1 year ago

See: https://github.com/GoogleCloudPlatform/pbmm-on-gcp-onboarding/blob/main/docs/google-cloud-onboarding.md#billing

Document and code a canary for billing export

Billing | Billing export | BigQuery Export

https://cloud.google.com/billing/docs/how-to/export-data-bigquery

Screen Shot 2023-02-27 at 2 14 44 PM
obriensystems commented 1 year ago

Steps: https://cloud.google.com/billing/docs/how-to/export-data-bigquery-setup

michael@cloudshell:~/docproc-hc-old/dev-csr (docproc-hc-old)$ export PROJECT_ID=billing-export-old
michael@cloudshell:~/docproc-hc-old/dev-csr (docproc-hc-old)$ export BOOT_PROJECT_ID=docproc-hc-old
michael@cloudshell:~/docproc-hc-old/dev-csr (docproc-hc-old)$ export BILLING_ID=$(gcloud alpha billing projects describe $BOOT_PROJECT_ID '--format=value(billingAccountName)' | sed 's/.*\///')
michael@cloudshell:~/docproc-hc-old/dev-csr (docproc-hc-old)$ echo $BILLING_ID
01E6E8-A....9-D21FF3
michael@cloudshell:~/docproc-hc-old/dev-csr (docproc-hc-old)$ export ORG_ID=$(gcloud projects get-ancestors $BOOT_PROJECT_ID --format='get(id)' | tail -1)
michael@cloudshell:~/docproc-hc-old/dev-csr (docproc-hc-old)$ echo $ORG_ID
583...7868

michael@cloudshell:~/docproc-hc-old/dev-csr (docproc-hc-old)$ gcloud projects create $PROJECT_ID --name="${PROJECT_ID}" --set-as-default
Create in progress for [https://cloudresourcemanager.googleapis.com/v1/projects/billing-export-old].
Waiting for [operations/cp.5073782008204155902] to finish...done.    
Enabling service [cloudapis.googleapis.com] on project [billing-export-old]...
Operation "operations/acat.p2-656637240499-5a4f32fd-385b-474c-8965-55978a8716e0" finished successfully.
Updated property [core/project] to [billing-export-old].
michael@cloudshell:~/docproc-hc-old/dev-csr (billing-export-old)$

michael@cloudshell:~/docproc-hc-old/dev-csr (billing-export-old)$ gcloud services list --enabled --project $PROJECT_ID | grep NAME
NAME: bigquery.googleapis.com
NAME: bigquerymigration.googleapis.com
NAME: bigquerystorage.googleapis.com
NAME: cloudapis.googleapis.com
NAME: clouddebugger.googleapis.com
NAME: cloudtrace.googleapis.com
NAME: datastore.googleapis.com
NAME: logging.googleapis.com
NAME: monitoring.googleapis.com
NAME: servicemanagement.googleapis.com
NAME: serviceusage.googleapis.com
NAME: sql-component.googleapis.com
NAME: storage-api.googleapis.com
NAME: storage-component.googleapis.com
NAME: storage.googleapis.com

michael@cloudshell:~/docproc-hc-old/dev-csr (billing-export-old)$ gcloud beta billing projects link ${PROJECT_ID} --billing-account ${BILLING_ID}

Verified the Billing Account Administrator role on the target billing account.
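A quick check along these lines confirms the role before linking (a sketch, not from the original session; USER_EMAIL is assumed to be set to the operator's account and the flatten/filter flags are my own choice):

# Sketch: confirm roles/billing.admin on the target billing account
gcloud billing accounts get-iam-policy $BILLING_ID \
  --flatten="bindings[].members" \
  --format="table(bindings.role)" \
  --filter="bindings.role:roles/billing.admin AND bindings.members:user:$USER_EMAIL"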

gcloud projects add-iam-policy-binding $PROJECT_ID  --member=user:$USER_EMAIL --role=roles/bigquery.dataOwner --quiet > /dev/null 2>&1
gcloud projects add-iam-policy-binding $PROJECT_ID  --member=user:$USER_EMAIL --role=roles/bigquery.admin --quiet > /dev/null 2>&1
gcloud projects add-iam-policy-binding $PROJECT_ID  --member=user:$USER_EMAIL --role=roles/editor --quiet > /dev/null 2>&1
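Since the output above is discarded, something like this verifies the bindings actually landed (a sketch, not from the session):

# Sketch: list the roles bound to the operator on the export project
gcloud projects get-iam-policy $PROJECT_ID \
  --flatten="bindings[].members" \
  --format="table(bindings.role)" \
  --filter="bindings.members:user:$USER_EMAIL"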

gcloud services enable bigquerydatatransfer.googleapis.com

michael@cloudshell:~/docproc-hc-old/dev-csr (billing-export-old)$ export REGION=northamerica-northeast1
michael@cloudshell:~/docproc-hc-old/dev-csr (billing-export-old)$ export DATASET=billing_ds
michael@cloudshell:~/docproc-hc-old/dev-csr (billing-export-old)$ bq --location $REGION mk --dataset ${PROJECT_ID}:${DATASET}
Dataset 'billing-export-old:billing_ds' successfully created.
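A quick confirmation of the dataset location and default encryption settings (sketch, not part of the original session):

# Sketch: dump the dataset metadata to confirm location and defaults
bq show --format=prettyjson ${PROJECT_ID}:${DATASET}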

Default Google-managed encryption keys
Screenshot 2023-02-27 at 22 44 54
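If customer-managed keys were required instead of the default Google-managed keys, the dataset could be created with a default KMS key roughly as follows (a sketch only - the key ring and key names are hypothetical, and the BigQuery service agent needs roles/cloudkms.cryptoKeyEncrypterDecrypter on the key first):

# Hypothetical CMEK variant of the dataset creation above
export KMS_KEY=projects/${PROJECT_ID}/locations/${REGION}/keyRings/billing-kr/cryptoKeys/billing-key   # hypothetical key
bq --location $REGION mk --dataset --default_kms_key ${KMS_KEY} ${PROJECT_ID}:${DATASET}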

Billing export creation: https://console.cloud.google.com/billing/export?_ga=2.119642696.1593990615.1677548807-1098396564.1647194753

Screenshot 2023-02-27 at 22 49 28
obriensystems commented 1 year ago

Enable pricing data

Screenshot 2023-02-27 at 22 53 39 Screenshot 2023-02-27 at 22 54 03

All 3 exports enabled (standard usage cost, detailed usage cost, and pricing)

Screenshot 2023-02-27 at 22 55 28

We need to wait up to 24 hours for the tables to appear under the dataset - looking for an on-demand option.
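For the canary, a poll along these lines avoids checking manually (a sketch; the table prefix follows the documented gcp_billing_export_v1_<BILLING_ACCOUNT_ID> naming, adjust as needed):

# Sketch: poll until the standard usage cost export table appears in the dataset
while ! bq ls ${PROJECT_ID}:${DATASET} | grep -q gcp_billing_export_v1; do
  echo "billing export table not present yet - sleeping 10 minutes"
  sleep 600
done
bq ls ${PROJECT_ID}:${DATASET}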

obriensystems commented 1 year ago

https://cloud.google.com/billing/docs/how-to/export-data-bigquery-tables#standard-usage-cost-data-schema "Shortly after enabling Cloud Billing export to BigQuery, billing data tables are automatically created in the BigQuery dataset."

Standard and detailed usage cost export:

When you first enable the standard or detailed usage cost export to BigQuery, it might take a few hours to start seeing your Google Cloud cost data.

obriensystems commented 1 year ago

Alternative:

Cost Table https://console.cloud.google.com/billing/019283-6F1AB5-7AD576/reports/tabular;credits=NONE;invoiceCorrections=TAX,BILLING_MODIFICATION?organizationId=583675367868

Screenshot 2023-02-27 at 23 02 50 Screenshot 2023-02-27 at 23 02 11
obriensystems commented 1 year ago

Tables were generated overnight - exporting to JSON (see the extract sketch after the sample record below).

https://cloud.google.com/billing/docs/how-to/export-data-bigquery-tables/standard-usage

Screenshot 2023-02-28 at 07 50 32 Screenshot 2023-02-28 at 07 52 50 Screenshot 2023-02-28 at 07 53 53 Screenshot 2023-02-28 at 07 55 01
{"billing_account_id":"019283-6...-7AD576","service":{"id":"CAE2-A537-4A95","description":"Source Repository"},"sku":{"id":"978B-7150-4C95","description":"Storage"},"usage_start_time":"2023-02-27 19:00:00 UTC","usage_end_time":"2023-02-27 20:00:00 UTC","project":{"id":"traffic-agz","number":"783080225319","name":"traffic-agz","labels":[],"ancestry_numbers":"/6839210352/64965792995/","ancestors":[{"resource_name":"projects/783080225319","display_name":"traffic-agz"},{"resource_name":"folders/64965792995","display_name":"traffic"},{"resource_name":"organizations/68...352","display_name":"alternate.gcp.zone"}]},"labels":[],"system_labels":[],"location":{"location":"global"},"tags":[],"export_time":"2023-02-28 05:48:13.935865 UTC","cost":0,"currency":"CAD","currency_conversion_rate":1.3327499999991479,"usage":{"amount":240559200,"unit":"byte-seconds","amount_in_pricing_units":9.2608388513326645e-08,"pricing_unit":"gibibyte month"},"credits":[],"invoice":{"month":"202302"},"cost_type":"regular"}
fmichaelobrien commented 1 year ago

Now that we have JSON export output in GCS that we can pipeline, we will examine the export schema for the various reporting packages.
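One way to dump the export schema for comparison against the reporting packages (sketch; TABLE is the same placeholder as in the earlier comment):

# Sketch: print the table schema as JSON
bq show --schema --format=prettyjson ${PROJECT_ID}:${DATASET}.${TABLE}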

obriensystems commented 1 year ago

Issue: CSV is a flat data format - without some agreed inter-field escaping it cannot carry nested fields, whereas JSON/XML are built for tree-based schemas. Thanks Claudia for the links.

Anyway, will raise an issue on the fact that the error does not make it back from the async call - the client needs to check logging to find the failure, plus the fact that the export does not appear in GCS.
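Until the error surfaces from the async call, a client-side check could look at the job history and the logs (a sketch; the logging filter is an assumption on my side):

# Sketch: recent BigQuery jobs and their states - a failed extract job shows up here
bq --project_id=${PROJECT_ID} ls -j -n 20
# Sketch: scan Cloud Logging for BigQuery errors
gcloud logging read 'resource.type="bigquery_resource" AND severity>=ERROR' --project ${PROJECT_ID} --limit 10 --format json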

Todo: check a mix of JSON/CSV export.

Use case: BQ connector to Power BI/Looker. The issue with the nested fields (service, project) occurs only for the CSV export, not for JSON (working fine).

Screen Shot 2023-02-28 at 10 40 59 AM
cartyc commented 1 year ago

Most of this, I think, can be automated via IaC by pre-creating the tables and enabling the required APIs, but the missing piece is enabling the actual billing export, which will need to be done manually. See this issue from the TF provider: https://github.com/hashicorp/terraform-provider-google/issues/4848

fmichaelobrien commented 1 year ago

Good plan and issue ref - I am planning on KRMing everything; so far I am prototyping the automation via a gcloud script. The 3 standard/detailed/pricing export initializations have been manual so far - checking the REST calls. A compliance issue would also be the timing for the tables to be auto-created/populated on the BigQuery dataset (roughly 3+, ~24 and ~48 hours for standard/detailed/pricing), so verification would need to be async.
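For the KRM direction, the dataset piece could look roughly like the Config Connector resource below (a sketch only - the resource name and namespace are hypothetical, and per the TF issue above the export enablement itself still has no declarative resource):

# Sketch: hypothetical Config Connector manifest for the billing dataset, applied via Config Controller
kubectl apply -n config-control -f - <<EOF
apiVersion: bigquery.cnrm.cloud.google.com/v1beta1
kind: BigQueryDataset
metadata:
  name: billing-ds
  annotations:
    cnrm.cloud.google.com/project-id: billing-export-old
spec:
  resourceID: billing_ds
  location: northamerica-northeast1
EOF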

obriensystems commented 1 year ago

Workaround: for ORM eager (non-lazy) loaded query results that include FK table populations, the CSV must carry partial or full JSON - this is done via a query, not an export.

Example

record
  +-- id
  +-- system
  +-- project
      +-- id
      +-- name
fmichaelobrien commented 1 year ago

The following is a workaround for nested fields like service or project. Run a standard query against the table and export the results as CSV; JSON will be emitted for the nested fields containing FK-normalized data.
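A query along these lines keeps the nested service/project records as JSON strings so the result set stays CSV-safe (a sketch; TABLE is the same placeholder as above and the column list is trimmed):

# Sketch: flatten nested RECORD fields to JSON strings so the result exports cleanly to CSV
bq query --use_legacy_sql=false --format=csv "
SELECT
  billing_account_id,
  TO_JSON_STRING(service) AS service,
  TO_JSON_STRING(project) AS project,
  usage_start_time,
  usage_end_time,
  cost,
  currency
FROM \`${PROJECT_ID}.${DATASET}.${TABLE}\`
LIMIT 100" > billing_sample.csv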

Ref:272009024

Screen Shot 2023-03-07 at 11 51 27 AM