broadinstitute / cromwell

Scientific workflow engine designed for simplicity & scalability. Trivially transition from one-off use cases to massive-scale production environments
http://cromwell.readthedocs.io/
BSD 3-Clause "New" or "Revised" License

timings file #6713

Open joshfactorial opened 2 years ago

joshfactorial commented 2 years ago

When running Cromwell (I've tried versions 34, 77, and 29) in server mode on a RHEL Linux cluster, I am having an issue with the timings file that Cromwell generates. Namely, I can no longer expand subworkflows, which I was able to do before. I'm using a curl command to query the API.

curl -s -X GET "http://<host>:<port>/api/workflows/v1/<UUID>/timing" -w "%{http_code}" -o timings.html

Looking at the resulting timings.html file, I see that the metadata does not include any subworkflow information, although the page still appears to contain code for expanding subworkflows. So I am wondering whether I need to modify the command. In the docs I found a mention of expandSubWorkflows for the metadata API call:

GET api/workflows/v2/1d919bd4-d046-43b0-9918-9964509689dd/metadata?expandSubWorkflows=true

I naively tried adding that query parameter to the timing call, but it didn't change the result.
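For reference, here is a minimal sketch of how the metadata call with expandSubWorkflows could be made from Python instead of curl. The helper names metadata_url and fetch_metadata are hypothetical, the v1 API version is an assumed default (the docs excerpt above shows v2), and the host, port, and workflow ID are placeholders. It only targets the metadata endpoint and makes no claim about whether the timing endpoint honors the same parameter.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def metadata_url(host, port, workflow_id, expand_subworkflows=True, api_version="v1"):
    """Build the metadata URL for one workflow (hypothetical helper).

    api_version defaults to "v1" as an assumption; adjust it to match
    whatever the server actually exposes.
    """
    base = f"http://{host}:{port}/api/workflows/{api_version}/{workflow_id}/metadata"
    # Cromwell expects a lowercase boolean in the query string.
    query = urlencode({"expandSubWorkflows": str(expand_subworkflows).lower()})
    return f"{base}?{query}"


def fetch_metadata(url, timeout=30):
    """GET the URL and parse the JSON body (requires a running server)."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)
```

Calling fetch_metadata(metadata_url("<host>", "<port>", "<UUID>")) against a running server should return the parsed metadata document as a dict.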

An example from the workflow:

    call ALIGNMENT.RunAlignmentTask as align {
        input:
            InputReads = AlignInputReads
    }

with RunAlignmentTask being:

workflow RunAlignmentTask {

    Array[Array[File]] InputReads   # One lane per subarray with one or two input reads
    Array[String] PlatformUnit      # One platform unit per alignment task
    Boolean PairedEnd               # Variable to check if single ended or not

    String SampleName               # Name of the Sample

    Array[Int] Indexes = range(length(InputReads))

    scatter (idx in Indexes) {

        if(PairedEnd) {
            call ALIGN.alignmentTask as ALIGN_paired {
                input:
                    SampleName=SampleName,
                    InputRead1=InputReads[idx][0],
                    InputRead2=InputReads[idx][1],
                    PlatformUnit=PlatformUnit[idx]
            }
        }

        if(!PairedEnd) {
            call ALIGN.alignmentTask as ALIGN_single {
                input:
                    SampleName=SampleName,
                    InputRead1=InputReads[idx][0],
                    InputRead2="null",
                    PlatformUnit=PlatformUnit[idx]
            }
        }

        File ResultBam = select_first([ALIGN_paired.OutputBams, ALIGN_single.OutputBams])
        File ResultBai = select_first([ALIGN_paired.OutputBais, ALIGN_single.OutputBais])
    }

    output {
        # Unify outputs from scatter and filter out null entries
        Array[File] OutputBams = ResultBam
        Array[File] OutputBais = ResultBai
    }

}

There are other subworkflows with numerous metrics tasks and so forth. Is there a parameter or command that gets the subworkflow breakdown to work in timings.html? Or will I need to merge the result of a metadata call into the timings file myself to make it work?
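If it does come down to wrangling the metadata by hand, a rough sketch of the flattening step might look like the following. It assumes the metadata JSON was fetched with expandSubWorkflows=true, so that subworkflow calls carry a nested subWorkflowMetadata object, and it relies on the calls, shardIndex, attempt, start, and end fields of that document. The function name flatten_calls and the "/"-joined call path are hypothetical choices for illustration, not anything Cromwell itself produces.

```python
from typing import Any, Dict, Iterator


def flatten_calls(metadata: Dict[str, Any], parent: str = "") -> Iterator[Dict[str, Any]]:
    """Yield one timing row per leaf call, descending into subworkflows.

    Assumes `metadata` is the parsed JSON of a metadata response fetched
    with expandSubWorkflows=true.
    """
    for call_name, attempts in metadata.get("calls", {}).items():
        full_name = f"{parent}/{call_name}" if parent else call_name
        for entry in attempts:
            sub = entry.get("subWorkflowMetadata")
            if sub is not None:
                # This call is a subworkflow: recurse into its own calls.
                yield from flatten_calls(sub, full_name)
            else:
                yield {
                    "call": full_name,
                    "shard": entry.get("shardIndex", -1),
                    "attempt": entry.get("attempt", 1),
                    "start": entry.get("start"),
                    "end": entry.get("end"),
                }
```

Each row then has the start and end timestamps needed to draw one bar per task, including tasks such as ALIGN_paired that only run inside a subworkflow.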

joshfactorial commented 2 years ago

Also, I tried going to the Jira page as the issue template suggests, but it tells me I don't have permission to view that page.