This PR includes a few additions and changes for computations over interval lists. The main components are:
Added ComputeIntervalBamStats.wdl, a workflow that takes a bam file, a list of interval lists, and some other metadata, and computes Picard's CollectWgsMetrics on the bam over those intervals, along with a mapping quality distribution for each region. These statistics are then collected and labeled by group into a few output tsvs that are ready to be visualized (dashboard coming soon, maybe!). The input_name and experiment inputs are added as columns to the output, so that outputs from multiple sample runs can be concatenated and easily grouped by experiment when plotting.
Added a simple docker image with samtools, which is used in the above workflow.
Fixed some naming conventions/locations for the interval files added previously. In particular, each interval file has three incarnations: a bed file, a classic interval_list with the hg38 sequence dictionary header, and an interval_list with a restricted hg38 sequence dictionary that excludes alt contigs. This last one was mostly relevant to some experiments I was doing without alt contigs, but it might be more generally useful for workflows that avoid those contigs and don't want fussy tools to complain about mismatched sequence dictionaries.
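To illustrate why the input_name and experiment columns are added to the output tsvs, here is a minimal sketch of how outputs from several sample runs could be concatenated and grouped downstream. It is not part of this PR: apart from input_name and experiment, the column names (interval_name, mean_coverage) and the sample values are hypothetical placeholders, and the real tsv schema may differ.

```python
import csv
import io
from collections import defaultdict


def concat_tsvs(tsv_texts):
    """Concatenate per-sample tsv outputs (given as strings) into one list of row dicts."""
    rows = []
    for text in tsv_texts:
        reader = csv.DictReader(io.StringIO(text), delimiter="\t")
        rows.extend(reader)
    return rows


def mean_by_experiment(rows, value_column):
    """Average a numeric column within each experiment group."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row["experiment"]].append(float(row[value_column]))
    return {exp: sum(vals) / len(vals) for exp, vals in grouped.items()}


# Hypothetical per-sample outputs; interval_name and mean_coverage are assumed columns.
HEADER = "input_name\texperiment\tinterval_name\tmean_coverage\n"
run_a = HEADER + "sampleA\tcontrol\texome\t30.0\n"
run_b = HEADER + "sampleB\tcontrol\texome\t34.0\n"
run_c = HEADER + "sampleC\ttreated\texome\t40.0\n"

# Because every row carries its own sample and experiment labels,
# concatenation needs no extra bookkeeping about which file a row came from.
combined = concat_tsvs([run_a, run_b, run_c])
summary = mean_by_experiment(combined, "mean_coverage")
```

Since each row is self-describing, the same approach should work for any number of runs, and a plotting tool can group on the experiment column directly.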