GoogleCloudPlatform / bigquery-antipattern-recognition

Utility to identify and rewrite common anti patterns in BigQuery SQL syntax
Apache License 2.0
76 stars 26 forks source link

Analyzing information_schema based on window: startTime and endTime #31

Closed sridivakar closed 11 months ago

sridivakar commented 11 months ago

with these command line options:

--read_from_info_schema 
--read_from_info_schema_start_time "2023-08-15 09:00:00" 
--read_from_info_schema_end_time "2023-08-16"
--read_from_info_schema_timeout_in_secs 60  
--info_schema_table_name `region-us`.INFORMATION_SCHEMA.JOBS 
--processing_project_id <my-project>
--output_table <my-project>.<my-dataset>.antipattern_output_table

Query on INFORMATION SCHEMA:

SELECT
  project_id,
  CONCAT(project_id, ":US.",  job_id) job_id, 
  query, 
  total_slot_ms / (1000 * 60 * 60 ) AS slot_hours, 
  user_email 
FROM
`region-us`.INFORMATION_SCHEMA.JOBS
WHERE 
  creation_time BETWEEN '2023-08-15 09:00:00' AND '2023-08-16'
  AND total_slot_ms > 0
  AND (statement_type != "SCRIPT" OR statement_type IS NULL)
  AND (reservation_id != 'default-pipeline' or reservation_id IS NULL)
  AND query not like '%INFORMATION_SCHEMA%' 
ORDER BY 
  project_id, start_time desc
franklinWhaite commented 11 months ago

LGTM, I only left one comment on the timeout, once that is addressed we can merge

sridivakar commented 11 months ago

LGTM, I only left one comment on the timeout, once that is addressed we can merge

Thanks, addressed the comment.