[x] a unique way of naming files so it is clear which one belongs to which process
-- grabbing the last 5 "/"-delimited fields seems unique enough; also creating /full_path to include full paths if needed
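As a sketch, the last-5-fields naming could be done with awk; the path below is a made-up example, and the real Cromwell bucket layout may differ:

```shell
# Hypothetical example object path (placeholders: BUCKET, WF_ID, call-somalier).
path="gs://BUCKET/cromwell-executions/immuno/WF_ID/call-somalier/shard-0/monitoring.log"
# Join the last 5 "/"-delimited fields into one unique local filename.
name=$(printf '%s\n' "$path" |
       awk -F/ '{n=NF; print $(n-4)"_"$(n-3)"_"$(n-2)"_"$(n-1)"_"$n}')
echo "$name"   # immuno_WF_ID_call-somalier_shard-0_monitoring.log
```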
[x] iterate through the bucket dir to grab the files
-- downloading whole folders locally is inefficient; each folder is over 100 GB
-- instead I only grab the files I want after generating their paths through gsutil ls gs://BUCKET/.../**/monitoring.log
-- copying the files over takes about 15 min
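A minimal sketch of this selective-copy approach (not the actual pull_monitor_logs.sh; BUCKET and WF_ID are placeholders): list only the monitoring.log objects with a recursive `**` wildcard, then copy each one under a name built from its last 5 path fields.

```shell
# Sketch: copy only monitoring.log files out of a Cromwell run directory.
pull_logs() {
    gs_prefix=$1   # e.g. gs://BUCKET/cromwell-executions/immuno/WF_ID
    outdir=$2
    mkdir -p "$outdir"
    # "**" makes gsutil ls match at any depth; each hit is copied under a
    # unique name built from its last 5 "/"-delimited fields.
    gsutil ls "${gs_prefix}/**/monitoring.log" | while read -r gspath; do
        name=$(printf '%s\n' "$gspath" |
               awk -F/ '{n=NF; print $(n-4)"_"$(n-3)"_"$(n-2)"_"$(n-1)"_"$n}')
        gsutil cp "$gspath" "$outdir/$name"
    done
}

# Only run where gsutil is actually installed:
if command -v gsutil >/dev/null 2>&1; then
    pull_logs "gs://BUCKET/cromwell-executions/immuno/WF_ID" monitor_logs
fi
```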
[x] automatically copy the pull_monitor_logs.sh script to /shared
[x] add summary.log
example cmd: bash /shared/pull_monitor_logs.sh --gs-path griffith-lab-test-layth/cromwell-executions/immuno --wf-id b6ef294d-080b-41cb-924e-ea53f6c54a2a
[x] add a script to collect all target files