Instead of using the temporary files we previously created ad hoc to obtain sex information, use the most up-to-date info from the Sample QC part of the large cohort pipeline (note: these are Hail Table files rather than CSV/TSV).
At the moment, the more reliably up-to-date files are cohort-specific (@katiedelange is this right?), but I anticipate there will eventually be a single tenk10k file, so I have added that syntax (commented out) for the future.
Using these files, which are in main-analysis, also allows us to run with standard permission; see the discussion in this PR: https://github.com/populationgenomics/saige-tenk10k/pull/143