doitintl / bigquery-optimization-queries

Queries to assist with BigQuery cost and performance.
MIT License
72 stars 31 forks source link

BigQuery slot computation: double counting #4

Open sourygna-cts opened 1 year ago

sourygna-cts commented 1 year ago

When doing some aggregations to compute the amount of slots per unit of time (like in the queries: query_slots_per_second.sql & slots_by_second.sql), then the following filter should be added in the WHERE condition:

statement_type != 'SCRIPT'

Otherwise, the amount of slots will be counted twice in some situations. This is mentioned in this doc https://cloud.google.com/bigquery/docs/information-schema-jobs-timeline :

For script jobs, the parent job also reports the total slot usage from its children jobs. To avoid double counting, use WHERE statement_type != "SCRIPT" to exclude the parent job.