populationgenomics / production-pipelines

Genomics workflows for CPG using Hail Batch
MIT License

Pass AIP multiple SV VCFs #715

Closed MattWellie closed 2 months ago

MattWellie commented 2 months ago

Allows a single AIP run to absorb multiple SV VCFs.

This accommodates the non-overlapping gCNV CNV VCFs, where a single cohort may be tiled across several separate batch VCFs.

  1. Pulls Pedigree generation out into its own Stage with a single result. Previously this was done twice per dataset, in 2 different places.
  2. The 'find SV files from metamist' method used to return None or a single file. It now returns a list of [full path, filename only] entries, one per SV VCF. This list can be empty.
  3. The SV labelling stage can now output multiple files, one per input SV VCF.
  4. The MOI testing stage can take multiple SV VCFs and process them all together.
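
The new return shape in point 2, and how points 3 and 4 consume it, can be sketched roughly as below. This is an illustration only, not the code in this PR: `find_sv_vcfs`, `labelled_outputs`, the bucket paths and the `labelled_` prefix are all hypothetical names, and the real method queries metamist rather than filtering a list of strings.

```python
# Hypothetical sketch: every name and path here is invented for illustration.


def find_sv_vcfs(candidate_paths: list[str]) -> list[tuple[str, str]]:
    """Return one (full path, filename only) pair per SV VCF found.

    An empty list means "no SV VCFs", replacing the old None return value.
    """
    return [
        (path, path.rsplit('/', 1)[-1])
        for path in candidate_paths
        if path.endswith(('.vcf.gz', '.vcf.bgz'))
    ]


def labelled_outputs(sv_vcfs: list[tuple[str, str]], out_dir: str) -> dict[str, str]:
    """Map each input SV VCF path to its own labelled output path (one per input)."""
    return {
        full_path: f'{out_dir}/labelled_{filename}'
        for full_path, filename in sv_vcfs
    }


# A cohort tiled across two batch VCFs (paths are made up).
sv_vcfs = find_sv_vcfs(
    [
        'gs://example-bucket/gcnv/batch1.vcf.bgz',
        'gs://example-bucket/gcnv/batch2.vcf.bgz',
        'gs://example-bucket/gcnv/notes.txt',
    ]
)
outputs = labelled_outputs(sv_vcfs, 'gs://example-bucket/aip')

# The MOI testing step would then receive every labelled VCF in one call.
moi_inputs = list(outputs.values())
```

Because an empty list is a valid result, callers can loop over it directly and do not need a separate None check when a dataset has no SV VCFs.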

The complementary change is already in AIP; I snuck it in during the pre-commit re-work.