Open malachig opened 2 years ago
To test this idea in its simplest form, I created an example monitor script and tested it on an active Google instance that was running a compute-intensive step: https://github.com/griffithlab/cloud-workflows/blob/main/scripts/monitor.sh
I manually logged into the GCP instance using the Google console to test it.
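For anyone unfamiliar with what a script like this does, here is a minimal sketch of the general idea: sample a resource at a fixed interval and print the current value next to its running peak. This is NOT the contents of the linked monitor.sh. The function names `mem_percent` and `monitor` and the `interval`/`samples` parameters are made up for illustration, and it only covers memory on Linux (it reads `/proc/meminfo`).

```shell
#!/usr/bin/env bash
set -euo pipefail

# Percent of memory in use, read from a meminfo-style file.
# Defaults to /proc/meminfo (Linux only); the optional path argument exists
# so the function can be tried against a saved copy.
mem_percent() {
  awk '
    $1 == "MemTotal:"     { total = $2 }
    $1 == "MemAvailable:" { avail = $2 }
    END { if (total > 0) printf "%.2f\n", (total - avail) * 100 / total }
  ' "${1:-/proc/meminfo}"
}

# Print elapsed seconds, current memory percent, and the running peak.
# interval and samples are hypothetical parameters for this sketch.
monitor() {
  local interval="$1" samples="$2" peak="0" now i
  echo "Seconds Memory_Percent Memory_Percent_Peak"
  for ((i = 0; i < samples; i++)); do
    now="$(mem_percent)"
    # awk does the floating-point comparison that bash cannot do natively.
    peak="$(awk -v a="$now" -v b="$peak" 'BEGIN { m = (a + 0 > b + 0) ? a : b; print m }')"
    echo "$((i * interval)) $now $peak"
    sleep "$interval"
  done
}
```

Calling `monitor 5 3` would take three samples five seconds apart. A real script would presumably add disk and CPU columns in the same way and run until the task ends.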
To test on a Cromwell run I am attempting the following:
I placed this script in our public google bucket: gs://griffith-lab-workflow-inputs/scripts/monitor.sh
I started a Cromwell VM and edited the workflow options config file on this system: sudo vim /shared/cromwell/workflow_options.json
I added the following block to it (at the top level, not nested in another block):
"monitoring_script": "gs://griffith-lab-workflow-inputs/scripts/monitor.sh"
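To make "at the top level" concrete, here is a rough sketch that writes a minimal options file with the key as a direct member of the outermost JSON object and then checks that it is there. It writes to a scratch file (an assumption for illustration) rather than the real /shared/cromwell/workflow_options.json, which likely holds other keys that are not shown here.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch path for illustration; the real file in this thread is
# /shared/cromwell/workflow_options.json and probably contains other keys too.
options_file="$(mktemp)"

# The key sits directly inside the outermost braces, not inside a nested object.
cat > "$options_file" <<'EOF'
{
  "monitoring_script": "gs://griffith-lab-workflow-inputs/scripts/monitor.sh"
}
EOF

# Quick sanity check before submitting a workflow.
if grep -q '"monitoring_script"' "$options_file"; then
  echo "monitoring_script is set"
fi
```

When adding the line to an existing file, remember the comma that separates it from the neighboring key, since a missing comma makes the whole file invalid JSON.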
According to the Cromwell docs, if you modify this conf file you do NOT need to restart Cromwell. These settings should take effect with the next workflow you run.
https://cromwell.readthedocs.io/en/stable/wf_options/Overview/
However, if you DID need to restart Cromwell, based on the startup script (https://github.com/griffithlab/cloud-workflows/blob/main/manual-workflows/server_startup.py) I think you could do: sudo systemctl restart cromwell
If my testing works as expected and we want this to happen automatically, then I think it would be added here: https://github.com/griffithlab/cloud-workflows/blob/3822d66e6a0423ade093f48f9c2535b07adfbb6a/manual-workflows/resources.sh#L135-L143
In my first test I looked in a gcs_localization.sh script for an individual task and I now see this:
# Localize singleton file 'gs://griffith-lab-workflow-inputs/scripts/monitor.sh' to '/cromwell_root/monitoring.sh'.
singleton_file_to_localize_573998f91cb96365bcb9696ac6baf714=(
"griffith-lab"
"3"
"gs://griffith-lab-workflow-inputs/scripts/monitor.sh"
"/cromwell_root/monitoring.sh"
)
localize_singleton_file "${singleton_file_to_localize_573998f91cb96365bcb9696ac6baf714[@]}"
And I see output like this (saved in the bucket as: monitoring.log) in a step that completed very quickly:
Seconds Memory_Percent Memory_Percent_Peak Memory_GB Memory_GB_Peak Disk_Percent Disk_Percent_Peak Disk_GB Disk_GB_Peak CPU_Percent CPU_Percent_Peak
0 8.86 8.86 0.34 0.34 23.00 23.00 7.43 7.43 2.29 2.29
This seems to be working as expected. To activate monitoring one can simply add this to /shared/cromwell/workflow_options.json on the head Cromwell VM:
"monitoring_script": "gs://griffith-lab-workflow-inputs/scripts/monitor.sh"
The results for each task are written to that task's output directory in the Google bucket, in a file named: monitoring.log
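Once a monitoring.log has been copied down from the bucket, a small helper along these lines could report the peak values for a task. This is only a sketch: the function name `summarize_peaks` is made up, and it assumes the log is a whitespace-separated header row followed by numeric rows, as in the sample output earlier in this thread. The sample file written here just reuses that one row.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the largest value seen in each *_Peak column of a monitoring.log file.
# Assumes a whitespace-separated header row followed by numeric rows.
summarize_peaks() {
  awk '
    NR == 1 {
      for (i = 1; i <= NF; i++) { name[i] = $i }
      ncol = NF
      next
    }
    NF == ncol {
      for (i = 1; i <= NF; i++) {
        if (name[i] ~ /_Peak$/ && (!(i in max) || $i + 0 > max[i])) { max[i] = $i + 0 }
      }
    }
    END {
      for (i = 1; i <= ncol; i++) { if (i in max) { printf "%s\t%.2f\n", name[i], max[i] } }
    }
  ' "$1"
}

# Local copy of the sample log from this thread, for demonstration only.
sample_log="$(mktemp)"
cat > "$sample_log" <<'EOF'
Seconds Memory_Percent Memory_Percent_Peak Memory_GB Memory_GB_Peak Disk_Percent Disk_Percent_Peak Disk_GB Disk_GB_Peak CPU_Percent CPU_Percent_Peak
0 8.86 8.86 0.34 0.34 23.00 23.00 7.43 7.43 2.29 2.29
EOF

summarize_peaks "$sample_log"
```

Since each row already carries a running peak, the last row alone should give the same numbers. Taking the maximum over all rows is just a bit more forgiving if a log is cut off mid-write.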
The Cromwell docs describe the capability to have monitoring for every step of your workflow. The docs I have been able to find are limited:
https://cromwell.readthedocs.io/en/stable/wf_options/Google/ Which states:
https://cromwell.readthedocs.io/en/latest/backends/Google/ Which states: