broadinstitute / gatk-sv

A structural variation pipeline for short-read sequencing
BSD 3-Clause "New" or "Revised" License
162 stars 71 forks source link

Change default behavior of GatherSampleEvidence to skip running module metrics #633

Open VJalili opened 5 months ago

VJalili commented 5 months ago

The GatherSampleEvidence workflow is set to run GatherSampleEvidenceMetrics by default. However, running that task requires setting values for the optional inputs primary_contigs_fai and sv_pipeline_base_docker. Therefore, the GatherSampleEvidence workflow currently fails if only the required input is provided.

Since WDL does not support making the optional inputs as required w.r.t to the value of another argument, and Terra UI lacks features to hint to a user about such a chain of requirements, in-line comments in code seem the only option, as it is implemented. However, in-code comments are less user-friendly and should be reserved for developers and power users only.

Skipping to run this task as a default behavior results in the successful execution of the workflow if only the required input is provided seems to be a better design and improved UX.

epiercehoffman commented 5 months ago

The default input configuration in our template JSONs and Terra workspace includes all inputs required to run module metrics in the workflows where it is enabled by default. I honestly don't think it's a big issue, because we don't expect users to configure inputs from scratch - the inputs are complicated in other ways too. I'm definitely open to disabling module metrics collection by default, and to any changes that improve the clarity of inputs, but I think for now the real solution in terms of ease of use here is to pre-configure the inputs, which we have done.

VJalili commented 5 months ago

It is a better UX to have the workflows run successfully if only the required input is provided. It seems we can get this covered following Mark's suggestion without changing the default behavior.

mwalker174 commented 2 months ago

We should disable module metrics across the pipeline. We don't use them in production and have just been useful for development in a few rare cases.