talkowski-lab / gnomad-sv-pipeline

Code and custom scripts relevant to gnomAD-SV (Collins*, Brand*, et al., 2020)
https://broad.io/gnomad_sv
MIT License

Incomplete task inputs in workflow #2

Open yoyosef opened 5 years ago

yoyosef commented 5 years ago

https://github.com/talkowski-lab/gnomad-sv-pipeline/blob/master/gnomad_sv_pipeline_wdls/module_01/01_depth_clustering_by_chrom.wdl

I noticed that the call to make_rdtest_bed doesn't pass a script input, even though the task declares File script. Is this a mistake?

  call make_rdtest_bed {
    input:
      dels=cluster_DELs.clustered_bed,
      dups=cluster_DUPs.clustered_bed,
      batch=batch,
  }

  call make_depth_vcf {
    input:
      bed=make_rdtest_bed.bed,
      batch=batch,
      contigs=contigs
  }

  call vcf_qc.master_vcf_qc as vcf_qc {
    input:
      vcf=make_depth_vcf.vcf,
      famfile=trios_famfile,
      ref_build=ref_build,
      prefix="${batch}_clustered_depth_vcf",
      sv_per_shard=10000,
      samples_per_shard=100,
      Sanders_2015_tarball=Sanders_2015_tarball,
      Collins_2017_tarball=Collins_2017_tarball,
      Werling_2018_tarball=Werling_2018_tarball
  }

  output {
    File clustered_vcf = make_depth_vcf.vcf
    File clustered_vcf_qc = vcf_qc.sv_vcf_qc_output
  }
}

task make_rdtest_bed { 
  File dels
  File dups
  File script
  String batch

  command <<<
    cat \
        <(python3 ${script} ${dels} | sed '1d') \
        <(python3 ${script} ${dups} | sed '1d') \
      | sort -k1,1V -k2,2n \
      | cat <(echo -e "#chrom start end name samples svtype" | sed -e 's/ /\t/g') - \
      > ${batch}.depth.bed;
  >>>

  output {
    File bed = "${batch}.depth.bed"
  }
}

yoyosef commented 5 years ago

Additionally, the following URL, which is imported in the WDL file, is a dead link:

https://api.firecloud.org/ga4gh/v1/tools/Talkowski-SV:master_SV_VCF_QC/versions/47/plain-WDL/descriptor
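
Regarding the missing script input, here is a minimal sketch of what an explicitly wired call might look like, assuming the script is exposed as a workflow-level File input. The names rdtest_bed_sketch, rdtest_bed_script, dels_bed, and dups_bed are hypothetical and do not come from the repository. The task body is the one quoted above.

```wdl
# Sketch only (WDL draft-2). `rdtest_bed_sketch`, `rdtest_bed_script`,
# `dels_bed`, and `dups_bed` are hypothetical names, not from the repository.
workflow rdtest_bed_sketch {
  File dels_bed            # stand-in for cluster_DELs.clustered_bed
  File dups_bed            # stand-in for cluster_DUPs.clustered_bed
  File rdtest_bed_script   # hypothetical: the Python script the task runs
  String batch

  call make_rdtest_bed {
    input:
      dels=dels_bed,
      dups=dups_bed,
      script=rdtest_bed_script,   # the input that the original call leaves out
      batch=batch
  }

  output {
    File bed = make_rdtest_bed.bed
  }
}

# Task as quoted in this issue, with balanced braces.
task make_rdtest_bed {
  File dels
  File dups
  File script
  String batch

  command <<<
    cat \
        <(python3 ${script} ${dels} | sed '1d') \
        <(python3 ${script} ${dups} | sed '1d') \
      | sort -k1,1V -k2,2n \
      | cat <(echo -e "#chrom start end name samples svtype" | sed -e 's/ /\t/g') - \
      > ${batch}.depth.bed;
  >>>

  output {
    File bed = "${batch}.depth.bed"
  }
}
```

If I understand draft-2 WDL correctly, an input left out of a call can also be supplied through the inputs JSON under a fully qualified key of the form <workflow_name>.make_rdtest_bed.script, so the omission may be intentional rather than a bug. I have not confirmed which approach the authors intended.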