RTIInternational/biocloud_gwas_workflows
Create workflow specs for QC of genotype array data #3 (Open)
ngaddis opened 4 years ago
ngaddis commented 4 years ago
Overview of process:
Starting point: PLINK format, plus strand, GRCh37
Convert variants to IMPUTE2 ID format
Remove duplicate IDs (based on call rate)
Flag individuals missing chrX or any other chromosome
Remove phenotype info in FAM file
Format phenotype data to standard format
Structure workflow (#4)
Partition data by ancestry
Call rate filter
HWE filter
Subject call rate filter (based on autosomes)
Relatedness workflow (#5)
Remove samples based on relatedness
Sex check and sample removal
Excessive homozygosity filtering
Set heterozygous haploid genotypes to missing
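As a rough illustration of how several of the filter steps above might map onto PLINK 1.9 invocations, here is a minimal Python sketch that only builds the command lines (it does not run them). The function name `build_qc_commands`, the output naming scheme, and all threshold values are assumptions for illustration; the issue does not specify thresholds or how steps are chained. The subject call rate, sex check, and homozygosity commands only write reports (`.imiss`, `.sexcheck`, `.het`); removing the flagged samples would be a separate step.

```python
from typing import Dict, List, Optional

# Placeholder thresholds: NOT taken from the issue, chosen only for illustration.
DEFAULT_THRESHOLDS: Dict[str, float] = {
    "variant_missing_rate": 0.03,  # passed to --geno
    "hwe_p_value": 1e-4,           # passed to --hwe
}


def build_qc_commands(
    bfile: str,
    out_prefix: str,
    thresholds: Optional[Dict[str, float]] = None,
) -> Dict[str, List[str]]:
    """Return one PLINK 1.9 argument list per QC step (hypothetical helper).

    Each command reads the same input fileset `bfile` and writes to
    `<out_prefix>.<step name>`; chaining the steps is left to the workflow.
    """
    t = dict(DEFAULT_THRESHOLDS)
    t.update(thresholds or {})

    # Step name -> PLINK flags specific to that step.
    step_flags: Dict[str, List[str]] = {
        # Variant call rate filter: drop variants above the missingness cutoff.
        "variant_call_rate": ["--geno", str(t["variant_missing_rate"]), "--make-bed"],
        # HWE filter: drop variants below the exact-test p-value cutoff.
        "hwe": ["--hwe", str(t["hwe_p_value"]), "--make-bed"],
        # Subject call rate on autosomes: write a per-sample missingness report.
        "subject_call_rate": ["--autosome", "--missing"],
        # Sex check: write a report comparing reported and inferred sex.
        "sex_check": ["--check-sex"],
        # Excessive homozygosity: write per-sample inbreeding coefficients.
        "homozygosity": ["--het"],
        # Set heterozygous haploid genotypes to missing.
        "set_hh_missing": ["--set-hh-missing", "--make-bed"],
    }

    return {
        name: ["plink", "--bfile", bfile, *flags, "--out", f"{out_prefix}.{name}"]
        for name, flags in step_flags.items()
    }


if __name__ == "__main__":
    for name, cmd in build_qc_commands("study", "qc").items():
        print(name, "->", " ".join(cmd))
```

Each argument list could be handed to `subprocess.run(cmd, check=True)` or translated into a workflow task; keeping the commands as data makes the thresholds easy to override per ancestry group.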
Overview of process: