Closed MattWellie closed 1 year ago
Note: this can be done now, but it will make more sense (and be a cleaner base to build on) once the de novo
branch is merged. Currently that is held up by a memory leak problem in Hail Query, which we are looking to mitigate.
Hail Query issue solved, de novo branch is available for review. The HTML report generation is stacked on top of that.
The only difference in Hail between running as family and running as singleton is the presence or absence of de novo results. That doesn't seem like enough reason to duplicate the intermediate files (and the expense of generating and storing them). Instead, I will add a flag to the final processing step that removes the de novo field from all variants when we choose to run a singleton analysis.
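As a rough illustration of that flag, here is a minimal sketch. It assumes each variant reaches the final processing step as a plain dict and that the de novo result lives under a key called `de_novo`; both the data shape and the names `strip_de_novo` and `DE_NOVO_KEY` are hypothetical, not taken from the actual codebase.

```python
from typing import Any, Dict, List

# Hypothetical field name; the real key used in the pipeline may differ.
DE_NOVO_KEY = "de_novo"


def strip_de_novo(
    variants: List[Dict[str, Any]], singleton: bool = False
) -> List[Dict[str, Any]]:
    """Return the variants, minus the de novo field when running as singletons.

    In family mode the input list is returned untouched. In singleton mode
    each variant is shallow-copied so the shared intermediate data is not
    modified in place.
    """
    if not singleton:
        return variants

    cleaned = []
    for variant in variants:
        trimmed = dict(variant)          # shallow copy of this variant
        trimmed.pop(DE_NOVO_KEY, None)   # drop the field if it is present
        cleaned.append(trimmed)
    return cleaned
```

Copying rather than mutating means the same intermediate results can be fed through both the family and the singleton path in one run.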
We will still need both a family and a singleton version of the PED/FAM file to ensure the MOI tests are run appropriately.
If two PED files are supplied, the analysis will run in both singleton and familial mode.
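One way the "two PED files means two modes" behaviour could be wired up is sketched below. The function name `plan_runs` and its two parameters are hypothetical; the real entry point and argument names may look quite different.

```python
from typing import List, Optional, Tuple


def plan_runs(
    family_ped: Optional[str] = None, singleton_ped: Optional[str] = None
) -> List[Tuple[str, str]]:
    """Work out which analysis modes to run from the PED files supplied.

    Returns a list of (mode, ped_path) pairs: one entry per PED file given,
    so supplying both files schedules both a family and a singleton run.
    """
    runs = []
    if family_ped:
        runs.append(("family", family_ped))
    if singleton_ped:
        runs.append(("singleton", singleton_ped))
    if not runs:
        raise ValueError("at least one PED file is required")
    return runs
```

Because both arguments are optional, supplying only the family PED keeps the original single-mode flow unchanged.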
During this development period, it's important to gauge performance of the tool when exposed to both family and singleton cohorts, as both are anticipated use-cases once deployed.
For example, singleton analysis will create more false positives, due to the lack of control family members, but de novo variants can only be found in a trio/family cohort.
To simulate this, run each test cohort (here just Acute Care) both ways, by manipulating the pedigree (PED) file used.
31 contains a script for generating a PED from the Sample Metadata API, including a flag to set all members to unrelated singletons
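For readers unfamiliar with what "setting all members to unrelated singletons" means at the file level, here is a minimal sketch. It is not the script referred to above. It assumes the standard six-column PED layout (family ID, individual ID, paternal ID, maternal ID, sex, phenotype) and, as an illustrative choice, reuses each individual ID as its own family ID; the function name `to_singletons` is hypothetical.

```python
from typing import Iterable, List


def to_singletons(ped_lines: Iterable[str]) -> List[str]:
    """Rewrite PED rows so that every sample is an unrelated singleton.

    Each individual gets a family of its own (family ID = individual ID,
    an assumption made for this sketch) and both parent columns are set
    to "0", the PED convention for a missing parent. Sex and phenotype
    are carried over unchanged. Blank lines and '#' comments are skipped.
    """
    singletons = []
    for line in ped_lines:
        if not line.strip() or line.startswith("#"):
            continue
        _family, individual, _father, _mother, sex, phenotype = line.split()[:6]
        singletons.append("\t".join([individual, individual, "0", "0", sex, phenotype]))
    return singletons
```

Running the same cohort once with the original PED and once with the output of a conversion like this gives the family and singleton comparisons described above.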
Alterations required (non-mandatory, so original flow is unaffected):