ucsc-cgp / cloud-billing-report

Generates a summary billing report for various UCSC-CGP cloud accounts.

Generate AWS report from BigQuery #33

Open natanlao opened 3 years ago

natanlao commented 3 years ago

From #26.

AWS billing data should be loaded into BigQuery and queried in a fashion similar to how GCP billing data is queried for the GCP report.

natanlao commented 3 years ago

The current CSV delivery format may not be suitable for this because its rows lack a unique ID (see #30). The new approach is to add a new delivery format (Parquet compression, for Athena import), import the new files into BigQuery, then use a SQL view to mimic the GCP billing report table.
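
A rough sketch of what the SQL view step might look like, not something that exists in this repo. It assumes the Parquet CUR lands in BigQuery with the usual Athena-style column names (`line_item_product_code`, `line_item_unblended_cost`, and so on) and that the report only needs a few fields of the GCP billing export schema (`service.description`, `project.id`, `cost`, `currency`, `usage_start_time`, `usage_end_time`). The function name `build_view_sql` and the table and view names are hypothetical.

```python
def build_view_sql(view: str, source_table: str) -> str:
    """Return a BigQuery CREATE VIEW statement that renames AWS CUR columns
    to the field names used by the GCP billing export table.

    Both `view` and `source_table` are fully qualified names chosen by the
    caller; the column mapping below is an assumption, not a tested schema.
    """
    return f"""\
CREATE OR REPLACE VIEW `{view}` AS
SELECT
  -- GCP nests these fields in records, so wrap the AWS columns in STRUCTs
  STRUCT(line_item_product_code AS description) AS service,
  STRUCT(line_item_usage_account_id AS id) AS project,
  line_item_unblended_cost AS cost,
  line_item_currency_code AS currency,
  line_item_usage_start_date AS usage_start_time,
  line_item_usage_end_date AS usage_end_time
FROM `{source_table}`
"""


if __name__ == "__main__":
    # Placeholder names; substitute the real project, dataset, and table.
    print(build_view_sql("my-project.billing.aws_as_gcp",
                         "my-project.billing.aws_cur"))
```

Treating the AWS account ID as the equivalent of a GCP project ID is a judgment call; the actual report may want a different grouping.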

natanlao commented 3 years ago

I was not able to complete this ticket. There is currently a CUR being generated in Parquet format in AWS; I was able to connect that bucket to a BigQuery Data Transfer job, but it complains about the format of the report. The Athena export format may not be appropriate.

Note that documentation on this was mistakenly committed before the functionality was: https://github.com/ucsc-cgp/cloud-billing-report/commit/7d1c0993e7df8cbcc5d9d733a2453a373e804139#diff-b335630551682c19a781afebcf4d07bf978fb1f8ac04c6bf87428ed5106870f5R19