broadinstitute / tgg_methods

Repo for miscellaneous methods developed by the methods group that don't fit anywhere else
MIT License
4 stars 0 forks

Comprehensive sex & relatedness checks on DRAGEN/GVS callset #87

Closed matren395 closed 2 months ago

matren395 commented 5 months ago

Once GP/DSP/PMs deliver more comprehensive information on sex & relatedness checks, I'll perform a one-time comprehensive check to confirm that the following all agree: 1) the biological sexes in Seqr, 2) the biological sexes in the returned sex checks, and 3) the biological sexes inferred from ploidy in the delivered DRAGEN callset. This will be done as part of a one-time, detailed QC of the new callset. In the future this will be productionized via https://app.zenhub.com/workspaces/tgg-methods-6613e5b68c36e00025172757/issues/gh/broadinstitute/tgg_methods/86

matren395 commented 2 months ago

These checks have been written into the pipeline by me and (more so) Ben Blankenmeister. They read the metrics.tsv file that the PMs/GP return, which contains Reported Sex and Predicted Sex, and ensure that the two match. Relatedness checks are unchanged and still automated, since they were already filtered to autosomes.
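
For readers unfamiliar with this kind of check, here is a minimal sketch of comparing the two sex columns in a metrics.tsv-style file. It is not the pipeline code: the column names (`sample_id`, `reported_sex`, `predicted_sex`), the function name, and the example rows are all assumptions made for illustration, and the real file's headers may differ.

```python
import csv
import io


def find_sex_mismatches(
    tsv_text,
    id_col="sample_id",          # hypothetical column name
    reported_col="reported_sex",  # hypothetical column name
    predicted_col="predicted_sex",  # hypothetical column name
):
    """Return (sample, reported, predicted) for rows where the two sexes differ.

    The comparison ignores case and surrounding whitespace.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    mismatches = []
    for row in reader:
        reported = row[reported_col].strip().lower()
        predicted = row[predicted_col].strip().lower()
        if reported != predicted:
            mismatches.append((row[id_col], row[reported_col], row[predicted_col]))
    return mismatches


# Made-up example rows, only to show the call shape.
EXAMPLE_TSV = (
    "sample_id\treported_sex\tpredicted_sex\n"
    "S1\tFemale\tFemale\n"
    "S2\tMale\tFemale\n"
    "S3\tmale\tMale\n"
)

example_mismatches = find_sex_mismatches(EXAMPLE_TSV)
```

With these made-up rows, only `S2` is flagged; `S3` passes because the comparison is case-insensitive.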

We were also able to check manually: the delivered metrics.tsv (or rather, the data exported from the Terra workspace containing these metrics) didn't have a disproportionate number of disagreements between the Predicted and Reported Sex. We got the same results when we imputed the sex ourselves (by looking at haploid/diploid behavior) and pulled the reported sex ourselves (from the Seqr pedigree).
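
As a rough illustration of what "looking at haploid/diploid behavior" can mean, the sketch below infers sex from the fraction of heterozygous calls on chrX, on the assumption that a sample with one X copy shows almost no heterozygous calls outside the pseudoautosomal regions. This is a simplified stand-in, not the method used for this callset: the function name, the input shape (a list of allele pairs), and the 0.1 threshold are all hypothetical.

```python
def infer_sex_from_chrx(genotypes, het_threshold=0.1):
    """Guess sex from chrX genotype calls.

    genotypes: list of (allele_a, allele_b) tuples for non-PAR chrX sites,
               with None for a missing call (assumed input shape).
    het_threshold: illustrative cutoff on the heterozygous fraction,
                   not a validated value.
    """
    called = [gt for gt in genotypes if gt is not None]
    if not called:
        return "Unknown"
    het_count = sum(1 for allele_a, allele_b in called if allele_a != allele_b)
    het_fraction = het_count / len(called)
    # Diploid-looking chrX (many het calls) suggests XX; haploid-looking suggests XY.
    return "Female" if het_fraction > het_threshold else "Male"


# Made-up genotype lists, only to show the call shape.
diploid_like = [(0, 1), (0, 0), (1, 1), (0, 1)]
haploid_like = [(0, 0), (1, 1), (1, 1), None]
```

A call like `infer_sex_from_chrx(diploid_like)` can then be compared against the sex recorded in the pedigree for the same sample. A production check would also need to handle sex-chromosome aneuploidies and low-coverage samples, which this sketch ignores.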