populationgenomics / cpg-infrastructure

This repository is used to manage the infrastructure at the CPG
MIT License

Aggregate changes for large batches #215

Closed milo-hyben closed 7 months ago

milo-hyben commented 7 months ago

This PR contains:

  1. A small materialised view change to be able to see Hail Batch Query jobs.
  2. Disable the monthly aggregate function that populates the Google Doc, as it has been replaced by the metamist billing report.
  3. Aggregate large batches (over 9K jobs per batch; the threshold is set by DEFAULT_MAX_JOBS_PER_BATCH): the billing information of all jobs is merged into a single job that takes the JobId of the last job. This helps us keep the number of jobs per batch down.
  4. Increase the cloud function timeout from 9 minutes to 1 hour. Only one concurrent instance is allowed, so if the subscription triggers another invocation after 10 minutes (the maximum Pub/Sub timeout possible for push), that invocation fails and waits for the first to complete.
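
To illustrate item 3, here is a minimal sketch of the aggregation idea, not the code in this PR. The job record shape (`job_id`, `resources`) and the function name `aggregate_large_batch` are hypothetical, the jobs are assumed to arrive ordered by job ID, and the 9000 value only mirrors the "9K" mentioned above.

```python
from collections import defaultdict

# Illustrative value only; the PR says the real threshold is set by
# DEFAULT_MAX_JOBS_PER_BATCH.
DEFAULT_MAX_JOBS_PER_BATCH = 9000


def aggregate_large_batch(jobs, max_jobs=DEFAULT_MAX_JOBS_PER_BATCH):
    """Collapse a batch's jobs into one record when it exceeds max_jobs.

    Each job is a dict with hypothetical keys 'job_id' and 'resources'
    (a mapping of resource name -> cost). Jobs are assumed to be ordered
    so that the last element is the last job of the batch.
    """
    if len(jobs) <= max_jobs:
        # Small batches are left untouched.
        return jobs

    # Sum the cost of every resource across all jobs in the batch.
    totals = defaultdict(float)
    for job in jobs:
        for resource, cost in job['resources'].items():
            totals[resource] += cost

    # The merged record takes the ID of the last job in the batch.
    return [{'job_id': jobs[-1]['job_id'], 'resources': dict(totals)}]
```

With this shape, a batch over the threshold contributes a single row to the aggregate table while the per-resource totals are preserved.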

I have tested on Feb 21-Feb 20, which previously took over 4 hours to run from a local machine. It is much faster now. We can discuss whether 9K is too restrictive, but I noticed that not many non-Hail Batch Query batches have more than 8K jobs.

codecov-commenter commented 7 months ago

Codecov Report

All modified and coverable lines are covered by tests :white_check_mark:

:exclamation: No coverage uploaded for pull request base (main@679d030).

Additional details and impacted files

```diff
@@           Coverage Diff           @@
##             main     #215   +/-   ##
=======================================
  Coverage        ?   90.65%
=======================================
  Files           ?        4
  Lines           ?      428
  Branches        ?        0
=======================================
  Hits            ?      388
  Misses          ?       40
  Partials        ?        0
```


milo-hyben commented 7 months ago

I made a few changes reflecting your comments:

  1. Hail Batch Query batches are now identified as batches with no name attribute.
  2. Max jobs per batch is still cut if jobs_cnt > 9K (happy to discuss if this is too restrictive).
  3. Added an option to load batches by batch_id.

Please let me know what you think.
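
As a rough sketch of items 1 and 3 (an assumption about the shape of the data, not the code in this PR): the batch records below are plain dicts with a hypothetical `attributes` mapping and `id` key, and both function names are made up for illustration.

```python
def is_hail_batch_query(batch):
    """Treat a batch with no 'name' attribute as a Hail Batch Query batch.

    'attributes' is a hypothetical key holding the batch's attribute mapping.
    """
    attributes = batch.get('attributes') or {}
    return 'name' not in attributes


def select_batches(batches, batch_ids):
    """Keep only the batches whose hypothetical 'id' is in batch_ids."""
    wanted = set(batch_ids)
    return [batch for batch in batches if batch['id'] in wanted]
```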

milo-hyben commented 7 months ago

I have added CI jobs. For jobs with 0 cost, a 0-cost entry is now added to the job resources so that the job is included in the aggregate table. I checked Oct-Dec 2023 and we are missing 1306 jobs; I will Slack you more details.
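
The zero-cost handling could look roughly like the sketch below. It is only an illustration under assumptions: the function name `resources_or_zero` and the placeholder resource key are hypothetical, and the real change may represent the 0-cost entry differently.

```python
def resources_or_zero(job_resources, placeholder='no-cost'):
    """Return a job's resource costs, or a single 0-cost entry if it has none.

    Giving every job at least one resource row means a later aggregation
    over resource rows will not silently drop zero-cost jobs.
    'placeholder' is a hypothetical resource name used only for illustration.
    """
    if job_resources:
        return dict(job_resources)
    return {placeholder: 0.0}
```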

milo-hyben commented 7 months ago

I have resolved the comments. One last review, please (I can't merge without it). Thank you!