Konrad suggested an analysis to check whether restricting to samples that share a rare doubleton would reduce the search space for relatedness inference. This PR adds two scripts that find samples sharing rare doubletons in the 455k UKB VDS and compare these sample pairs to the related samples found when running sample QC on the original 455k UKB callset.
Major changes:
tgg/relatedness/doubleton_utils.py: Script with functions to get high-quality doubleton sites, to get sample IDs for pairs that share doubletons, and to compare doubleton-sharing pairs to related samples
tgg/relatedness/doubletons.py: Script that calls the functions in doubleton_utils
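
For reviewers unfamiliar with the idea, here is a minimal pure-Python sketch of the pair-finding and comparison logic described above. It is not the code in this PR: the function names (`doubleton_pairs`, `compare_to_related`), the input layout (one carrier entry per alternate allele), and the toy data are all assumptions made for illustration, and it skips the rarity and quality filters that the real scripts apply.

```python
def doubleton_pairs(carriers_by_site):
    """Return the set of sample pairs that share at least one doubleton.

    carriers_by_site (hypothetical layout): site ID -> list of sample IDs,
    with one entry per alternate allele, so a homozygous sample appears twice.
    """
    pairs = set()
    for carriers in carriers_by_site.values():
        unique = sorted(set(carriers))
        # A doubleton has allele count 2. Only sites where the two alleles
        # sit in two different samples give a candidate pair.
        if len(carriers) == 2 and len(unique) == 2:
            pairs.add((unique[0], unique[1]))
    return pairs


def compare_to_related(pairs, related_pairs, n_samples):
    """Summarize how well doubleton-sharing pairs cover known related pairs."""
    related = {tuple(sorted(p)) for p in related_pairs}
    return {
        "all_pairs": n_samples * (n_samples - 1) // 2,  # full search space
        "candidate_pairs": len(pairs),                  # reduced search space
        "related_recovered": len(related & pairs),
        "related_missed": len(related - pairs),
    }


# Toy data, invented for illustration only.
carriers_by_site = {
    "chr1:100:A:T": ["s1", "s2"],        # doubleton shared by s1 and s2
    "chr1:200:G:C": ["s3", "s3"],        # homozygous in one sample: no pair
    "chr2:300:C:G": ["s2", "s1"],        # same pair again, counted once
    "chr2:400:T:A": ["s1", "s4", "s5"],  # allele count 3: not a doubleton
    "chr3:500:A:G": ["s4", "s5"],        # doubleton shared by s4 and s5
}
pairs = doubleton_pairs(carriers_by_site)
summary = compare_to_related(pairs, [("s2", "s1"), ("s3", "s4")], n_samples=5)
print(pairs)
print(summary)
```

In this toy case the candidate set has 2 pairs out of 10 possible, and one of the two related pairs is missed. That trade-off between search-space reduction and recall is what the comparison against the sample QC results is meant to measure.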
Minor changes: