Recent changes to the way we create Pedigree files meant that all members of a family were being kept, even those with no data. This broke the MOI tests: when evaluating MOI in other family members, we were trying to query for relatives who are not in the VCF(s).
Compute overheads with highly fragmented Hail Matrix Tables were causing every task to fail. It looks like MTs with HUGE partition counts carry too much overhead now that we're running with finite compute. These same jobs were previously spinning up 00ks of jobs in QOB to process MTs, which masked the mountainous strain.
Proposed Changes
Ped members who do not have an internal ID (used as an indicator of presence in the variant data) are masked with '0's instead of being retained under an external ID.
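For reviewers, the masking can be sketched roughly as below. This is an illustration and not the code in this PR: `PedRow`, `mask_pedigree` and the external-to-internal ID mapping are hypothetical names, and dropping the rows of members without an internal ID is an assumption based on the description above.

```python
from typing import NamedTuple

MISSING = '0'  # standard PED placeholder for an unknown parent


class PedRow(NamedTuple):
    """One PED line; hypothetical container used only in this sketch."""
    family: str
    individual: str
    father: str
    mother: str
    sex: str
    phenotype: str


def mask_pedigree(rows: list[PedRow], internal_ids: dict[str, str]) -> list[PedRow]:
    """Keep only members with an internal ID; mask absent parents with '0'.

    `internal_ids` maps external ID -> internal ID (assumed input).
    """
    masked = []
    for row in rows:
        if row.individual not in internal_ids:
            # assumption: members with no variant data get no row of their own
            continue
        masked.append(
            row._replace(
                individual=internal_ids[row.individual],
                # parents without an internal ID fall back to the '0' placeholder
                father=internal_ids.get(row.father, MISSING),
                mother=internal_ids.get(row.mother, MISSING),
            )
        )
    return masked
```

With this shape, a trio whose father has no internal ID becomes a child row with father `'0'`, so MOI checks never look up a sample that is absent from the VCF(s).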
Upon reading the MT we check the degree of fragmentation. If it's too high (an arbitrary threshold of 10k partitions), we squash the partitions down to CPU * 10.
A new method has been included to filter out, as early as possible, all variants not within our region of interest for the run. This was initially included when trying to create an earlier checkpoint with less computational overhead (now solved through repartitioning), but the runtime looks improved, so it has been kept.
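One way such an early filter could look in Hail is sketched below. It assumes the region of interest is available as `(contig, start, end)` tuples and that the data is on GRCh38. Neither is stated in this PR, and the helper names are hypothetical. `hl.filter_intervals` and `hl.parse_locus_interval` are existing Hail functions.

```python
def to_interval_strings(regions: list[tuple[str, int, int]]) -> list[str]:
    """Format (contig, start, end) tuples as 'contig:start-end' strings."""
    return [f'{contig}:{start}-{end}' for contig, start, end in regions]


def filter_to_regions(mt, regions: list[tuple[str, int, int]], genome: str = 'GRCh38'):
    """Drop all rows of `mt` that fall outside the given regions."""
    import hail as hl  # imported lazily so the helper above works without Hail

    intervals = [
        hl.parse_locus_interval(text, reference_genome=genome)
        for text in to_interval_strings(regions)
    ]
    # interval filtering lets Hail skip partitions that cannot overlap
    return hl.filter_intervals(mt, intervals)
```

Applying the filter straight after the read means every later step, including any checkpoint, only touches the variants we actually intend to analyse.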