broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments
http://cromwell.readthedocs.io/
BSD 3-Clause "New" or "Revised" License
996 stars 361 forks source link

Abstract PAPI v2 Execution Events into higher groups #4162

Closed ruchim closed 6 years ago

ruchim commented 6 years ago

When using PAPI v2, the execution events for a single job are broken down into a lot of minute actions, and those actions are represented in the timing diagram. The issue is that the level of granularity provided by Cromwell's execution events is too zoomed in, and its not easy to digest. Since the operation metadata itself will hold most of this granularity, it should be useful for Cromwell to group certain actions into a higher level grouping concept based on its understanding of the job. For example, here's the execution events from a single run:

        "executionEvents": [
          {
            "startTime": "2018-08-14T16:16:40.069663Z",
            "description": "Started running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stderr gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/ 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stderr gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\"",
            "endTime": "2018-08-14T16:16:42.510303Z"
          },
          {
            "startTime": "2018-08-14T16:16:43.002063Z",
            "description": "Started running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil   cp \/cromwell_root\/rc gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/ 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing  cp \/cromwell_root\/rc gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\"",
            "endTime": "2018-08-14T16:16:45.288207Z"
          },
          {
            "startTime": "2018-08-14T16:16:57.822335Z",
            "description": "Stopped running \"\/bin\/sh -c cat \/cromwell_root\/0c83f20c\/cwl_output_json_references.txt 2>\/dev\/null | xargs -I % sh -c 'gsutil -m  cp -r % gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/$(echo % | sed -e \\\"s\/^\\\\\/\/\/\\\") 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -m -u dos-testing cp -r % gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/$(echo % | sed -e \\\"s\/^\\\\\/\/\/\\\"); fi '\"",
            "endTime": "2018-08-14T16:16:58.180303Z"
          },
          {
            "startTime": "2018-08-14T16:16:56.948463Z",
            "description": "Started running \"-c mkdir -p \/cromwell_root\/0c83f20c && cat \/cromwell_root\/cwl.output.json 2>\/dev\/null | jq -r '.. | .path? \/\/ .location? \/\/ empty | gsub(\\\"file:\/\/\\\"; \\\"\\\")'  > \/cromwell_root\/0c83f20c\/cwl_output_json_references.txt\"",
            "endTime": "2018-08-14T16:16:57.281247Z"
          },
          {
            "startTime": "2018-08-14T16:16:36.306767Z",
            "description": "Started running \"\/bin\/bash \/cromwell_root\/script\"",
            "endTime": "2018-08-14T16:16:37.108175Z"
          },
          {
            "startTime": "2018-08-14T16:14:03.091743Z",
            "description": "Stopped pulling \"google\/cloud-sdk:alpine\"",
            "endTime": "2018-08-14T16:14:03.127407Z"
          },
          {
            "startTime": "2018-08-14T16:13:25.071851Z",
            "description": "waiting for quota",
            "endTime": "2018-08-14T16:13:25.620872Z"
          },
          {
            "startTime": "2018-08-14T16:16:57.281247Z",
            "description": "Stopped running \"-c mkdir -p \/cromwell_root\/0c83f20c && cat \/cromwell_root\/cwl.output.json 2>\/dev\/null | jq -r '.. | .path? \/\/ .location? \/\/ empty | gsub(\\\"file:\/\/\\\"; \\\"\\\")'  > \/cromwell_root\/0c83f20c\/cwl_output_json_references.txt\"",
            "endTime": "2018-08-14T16:16:57.673071Z"
          },
          {
            "startTime": "2018-08-14T16:17:05.013311Z",
            "description": "Stopped running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" -m rsync -r \/google\/logs gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/pipelines-logs 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" -m rsync -r \/google\/logs gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/pipelines-logs; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\": sh: -q: unknown operand",
            "endTime": "2018-08-14T16:17:06.327716Z"
          },
          {
            "startTime": "2018-08-14T16:14:28.018319Z",
            "description": "Started pulling \"ubuntu@sha256:3f119dc0737f57f704ebecac8a6d8477b0f6ca1ca0332c7ee1395ed2c6a82be7\"",
            "endTime": "2018-08-14T16:14:32.007439Z"
          },
          {
            "startTime": "2018-08-14T16:13:55.499871Z",
            "description": "Started pulling \"google\/cloud-sdk:alpine\"",
            "endTime": "2018-08-14T16:14:03.091743Z"
          },
          {
            "startTime": "2018-08-14T16:14:35.424927Z",
            "description": "Started running \"\/bin\/sh -c mkdir -p $(dirname \/cromwell_root\/cgp-commons-multi-region-public\/topmed_open_access\/711e55a4-c3e9-50af-8a85-41829fb84cae\/NWD455342.recab.cram) && \/scripts\/dosUrlLocalizer.sc dos:\/\/dg.4503\/4d427aa3-5640-4f00-81ae-c33443f84acf \/cromwell_root\/cgp-commons-multi-region-public\/topmed_open_access\/711e55a4-c3e9-50af-8a85-41829fb84cae\/NWD455342.recab.cram\"",
            "endTime": "2018-08-14T16:16:33.026575Z"
          },
          {
            "startTime": "2018-08-14T16:16:33.026575Z",
            "description": "Stopped running \"\/bin\/sh -c mkdir -p $(dirname \/cromwell_root\/cgp-commons-multi-region-public\/topmed_open_access\/711e55a4-c3e9-50af-8a85-41829fb84cae\/NWD455342.recab.cram) && \/scripts\/dosUrlLocalizer.sc dos:\/\/dg.4503\/4d427aa3-5640-4f00-81ae-c33443f84acf \/cromwell_root\/cgp-commons-multi-region-public\/topmed_open_access\/711e55a4-c3e9-50af-8a85-41829fb84cae\/NWD455342.recab.cram\": ps:\/\/repo1.maven.org\/maven2\/io\/spray\/spray-json_2.12\/1.3.4\/spray-json_2.12-1.3.4-sources.jar.sha1\nDownloading https:\/\/repo1.maven.org\/maven2\/org\/scala-lang\/scala-library\/2.12.3\/scala-library-2.12.3-sources.jar\nDownloading https:\/\/repo1.maven.org\/maven2\/org\/scala-lang\/scala-library\/2.12.3\/scala-library-2.12.3-sources.jar.sha1\nDownloaded https:\/\/repo1.maven.org\/maven2\/org\/scala-lang\/scala-library\/2.12.3\/scala-library-2.12.3-sources.jar.sha1\nDownloaded https:\/\/repo1.maven.org\/maven2\/io\/spray\/spray-json_2.12\/1.3.4\/spray-json_2.12-1.3.4-sources.jar.sha1\nDownloaded https:\/\/repo1.maven.org\/maven2\/io\/spray\/spray-json_2.12\/1.3.4\/spray-json_2.12-1.3.4-sources.jar\nDownloaded https:\/\/repo1.maven.org\/maven2\/org\/scala-lang\/scala-library\/2.12.3\/scala-library-2.12.3-sources.jar\nCompiling \/scripts\/dosUrlLocalizer.sc #2\nSLF4J: Failed to load class \"org.slf4j.impl.StaticLoggerBinder\".\nSLF4J: Defaulting to no-operation (NOP) logger implementation\nSLF4J: See http:\/\/www.slf4j.org\/codes.html#StaticLoggerBinder for further details.",
            "endTime": "2018-08-14T16:16:33.718991Z"
          },
          {
            "startTime": "2018-08-14T16:13:13.678Z",
            "description": "RequestingExecutionToken",
            "endTime": "2018-08-14T16:13:14.570Z"
          },
          {
            "startTime": "2018-08-14T16:14:33.113135Z",
            "description": "Started running \"\/bin\/sh -c while true; do retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stderr gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/ 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stderr gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry 2> \/dev\/null || true; sleep 60; done\"",
            "endTime": "2018-08-14T16:14:33.759759Z"
          },
          {
            "startTime": "2018-08-14T16:13:14.755Z",
            "description": "RunningJob",
            "endTime": "2018-08-14T16:13:25.071851Z"
          },
          {
            "startTime": "2018-08-14T16:13:14.571Z",
            "description": "PreparingJob",
            "endTime": "2018-08-14T16:13:14.755Z"
          },
          {
            "startTime": "2018-08-14T16:13:14.570Z",
            "description": "WaitingForValueStore",
            "endTime": "2018-08-14T16:13:14.571Z"
          },
          {
            "startTime": "2018-08-14T16:16:37.479167Z",
            "description": "Started running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stdout gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/ 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stdout gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\"",
            "endTime": "2018-08-14T16:16:39.695039Z"
          },
          {
            "startTime": "2018-08-14T16:16:57.673071Z",
            "description": "Started running \"\/bin\/sh -c cat \/cromwell_root\/0c83f20c\/cwl_output_json_references.txt 2>\/dev\/null | xargs -I % sh -c 'gsutil -m  cp -r % gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/$(echo % | sed -e \\\"s\/^\\\\\/\/\/\\\") 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -m -u dos-testing cp -r % gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/$(echo % | sed -e \\\"s\/^\\\\\/\/\/\\\"); fi '\"",
            "endTime": "2018-08-14T16:16:57.822335Z"
          },
          {
            "startTime": "2018-08-14T16:14:33.759759Z",
            "description": "Started running \"\/bin\/sh -c while true; do retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stdout gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/ 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stdout gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry 2> \/dev\/null || true; sleep 60; done\"",
            "endTime": "2018-08-14T16:14:34.534911Z"
          },
          {
            "startTime": "2018-08-14T16:14:03.127407Z",
            "description": "Started pulling \"broadinstitute\/cromwell-dos:34-d8acfe3\"",
            "endTime": "2018-08-14T16:14:27.981711Z"
          },
          {
            "startTime": "2018-08-14T16:14:32.463087Z",
            "description": "Started running \"\/bin\/sh -c while true; do retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/google\/logs\/output gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/t.log 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/google\/logs\/output gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/t.log; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry 2> \/dev\/null || true; sleep 30; done\"",
            "endTime": "2018-08-14T16:14:33.113135Z"
          },
          {
            "startTime": "2018-08-14T16:17:00.937007Z",
            "description": "Started running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" -m rsync -r \/google\/logs gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/pipelines-logs 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" -m rsync -r \/google\/logs gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/pipelines-logs; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\"",
            "endTime": "2018-08-14T16:17:05.013311Z"
          },
          {
            "startTime": "2018-08-14T16:16:45.309551Z",
            "description": "Started pulling \"stedolan\/jq@sha256:a61ed0bca213081b64be94c5e1b402ea58bc549f457c2682a86704dd55231e09\"",
            "endTime": "2018-08-14T16:16:56.518719Z"
          },
          {
            "startTime": "2018-08-14T16:13:25.620872Z",
            "description": "Worker \"google-pipelines-worker-4247201c1820698c5c935fb230c4a278\" assigned in \"us-central1-f\"",
            "endTime": "2018-08-14T16:13:55.499871Z"
          },
          {
            "startTime": "2018-08-14T16:14:27.981711Z",
            "description": "Stopped pulling \"broadinstitute\/cromwell-dos:34-d8acfe3\"",
            "endTime": "2018-08-14T16:14:28.018319Z"
          },
          {
            "startTime": "2018-08-14T16:14:34.534911Z",
            "description": "Started running \"\/bin\/bash -c mkdir -p \/cromwell_root && chmod -R a+rwx \/cromwell_root\"",
            "endTime": "2018-08-14T16:14:34.810415Z"
          },
          {
            "startTime": "2018-08-14T16:16:45.288207Z",
            "description": "Stopped running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil   cp \/cromwell_root\/rc gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/ 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing  cp \/cromwell_root\/rc gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\": sh: -q: unknown operand",
            "endTime": "2018-08-14T16:16:45.309551Z"
          },
          {
            "startTime": "2018-08-14T16:16:39.695039Z",
            "description": "Stopped running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stdout gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/ 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stdout gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\": sh: -q: unknown operand",
            "endTime": "2018-08-14T16:16:40.069663Z"
          },
          {
            "startTime": "2018-08-14T16:14:32.007439Z",
            "description": "Stopped pulling \"ubuntu@sha256:3f119dc0737f57f704ebecac8a6d8477b0f6ca1ca0332c7ee1395ed2c6a82be7\"",
            "endTime": "2018-08-14T16:14:32.463087Z"
          },
          {
            "startTime": "2018-08-14T16:17:06.327716Z",
            "description": "Worker released",
            "endTime": "2018-08-14T16:18:14.462Z"
          },
          {
            "startTime": "2018-08-14T16:16:58.180303Z",
            "description": "Started running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/google\/logs\/output gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/t.log 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/google\/logs\/output gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/t.log; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\"",
            "endTime": "2018-08-14T16:17:00.575023Z"
          },
          {
            "startTime": "2018-08-14T16:13:13.678Z",
            "description": "Pending",
            "endTime": "2018-08-14T16:13:13.678Z"
          },
          {
            "startTime": "2018-08-14T16:16:42.510303Z",
            "description": "Stopped running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stderr gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/ 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/cromwell_root\/stderr gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\": sh: -q: unknown operand",
            "endTime": "2018-08-14T16:16:43.002063Z"
          },
          {
            "startTime": "2018-08-14T16:17:00.575023Z",
            "description": "Stopped running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil  -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/google\/logs\/output gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/t.log 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing -h \\\"Content-Type: text\/plain; charset=UTF-8\\\" cp \/google\/logs\/output gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/t.log; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\": sh: -q: unknown operand",
            "endTime": "2018-08-14T16:17:00.937007Z"
          },
          {
            "startTime": "2018-08-14T16:16:33.718991Z",
            "description": "Started running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil  cp gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/script \/cromwell_root\/script 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing cp gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/script \/cromwell_root\/script; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\"",
            "endTime": "2018-08-14T16:16:35.937599Z"
          },
          {
            "startTime": "2018-08-14T16:16:35.937599Z",
            "description": "Stopped running \"\/bin\/sh -c retry() { for i in `seq 3`; do gsutil  cp gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/script \/cromwell_root\/script 2> gsutil_output.txt; RC_GSUTIL=$?; if [[ \\\"$RC_GSUTIL\\\" -eq 1 && grep -q \\\"Bucket is requester pays bucket but no user project provided.\\\" gsutil_output.txt ]]; then\\n echo \\\"Retrying with user project dos-testing\\\" && gsutil -u dos-testing cp gs:\/\/fc-f5576422-7954-4da1-8005-30c2df8d37d5\/984b5570-abe7-470f-b5cc-9243bf98518c\/w\/f8a1e7ee-3286-4071-a1d6-e68667b732de\/call-t\/script \/cromwell_root\/script; fi ; RC=$?; if [[ \\\"$RC\\\" -eq 0 ]]; then break; fi; sleep 5; done; return \\\"$RC\\\"; }; retry\": sh: -q: unknown operand",
            "endTime": "2018-08-14T16:16:36.306767Z"
          },
          {
            "startTime": "2018-08-14T16:18:14.462Z",
            "description": "UpdatingJobStore",
            "endTime": "2018-08-14T16:18:15.459Z"
          },
          {
            "startTime": "2018-08-14T16:14:34.810415Z",
            "description": "Stopped running \"\/bin\/bash -c mkdir -p \/cromwell_root && chmod -R a+rwx \/cromwell_root\"",
            "endTime": "2018-08-14T16:14:35.424927Z"
          },
          {
            "startTime": "2018-08-14T16:16:37.108175Z",
            "description": "Stopped running \"\/bin\/bash \/cromwell_root\/script\"",
            "endTime": "2018-08-14T16:16:37.479167Z"
          },
          {
            "startTime": "2018-08-14T16:16:56.518719Z",
            "description": "Stopped pulling \"stedolan\/jq@sha256:a61ed0bca213081b64be94c5e1b402ea58bc549f457c2682a86704dd55231e09\"",
            "endTime": "2018-08-14T16:16:56.948463Z"
          }
        ],

Although there's a lot of events -- there are a few key stages here:

  1. Waiting for Quota -- which is its own independent execution event and it should remain that way.
  2. Data localization -- it seems like since the execution events grow with the number of input files, all execution events related to copying files onto the VM should condense all unique copy actions into one aggregated localization event.
  3. Pulling docker -- time pulling the docker used by the command. Often this can be a huge docker so its worth treating it as a first-class execution event to watch its performance.
  4. Running command -- this is the time spent running the script.
  5. Data delocalization -- fully analogous to data localization explained above.

AC: Condense the execution events generated by the PAPI into those 5 categories inside the Cromwell metadata.

screen shot 2018-09-27 at 3 08 10 am

cjllanwarne commented 6 years ago

Note: I think a tech talk or PO choice would be valuable on whether this grouping is happening "pre-metadata", "as an endpoint option" or "in visualization code".

FWIW I would lean towards one of the latter two options

ruchim commented 6 years ago

I definitely like the idea of pre-metadata as well.

geoffjentry commented 6 years ago

Only if it does not involve retroactively applying it to old events.

So if you need it to apply to old data it needs to be post-metadata

cjllanwarne commented 6 years ago

Oh, I was actually saying I prefer the other two (ie not "pre-metadata" 😳) - the preference being that we record everything, and then choose how to display it later

ruchim commented 6 years ago

Ah! you mean evolve the timing diagram logic to group execution events?