I have implemented a sample QC notebook. It analyses per-sample depth, missing calls, and heterozygosity in order to exclude low-quality samples. It also computes the autosome/sex chromosome depth ratio, but this is not yet used for exclusion.
The notebook then writes a new metadata file to results/config/metadata.qcpass.tsv, which becomes the input metadata for all downstream analysis steps, such as the allele frequency, PCA, and ag-vampir modules.
QC steps, such as coverage, should still analyse the poor-quality samples for completeness.
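For reviewers, the sample exclusion step can be pictured roughly as follows. This is only a minimal sketch, not the notebook's code: the thresholds, the column names (`sampleID`, `mean_depth`, `missing_fraction`, `heterozygosity`), and the use of a single upper bound on heterozygosity are all assumptions made for illustration. It uses only the standard library so it runs on its own.

```python
import csv
import io

# Hypothetical thresholds; the notebook's actual cut-offs may differ.
MIN_MEAN_DEPTH = 10.0
MAX_MISSING_FRACTION = 0.1
MAX_HETEROZYGOSITY = 0.05


def passes_qc(stats):
    """Return True if a per-sample QC record meets every threshold."""
    return (
        stats["mean_depth"] >= MIN_MEAN_DEPTH
        and stats["missing_fraction"] <= MAX_MISSING_FRACTION
        and stats["heterozygosity"] <= MAX_HETEROZYGOSITY
    )


def write_qcpass_metadata(metadata_rows, qc_by_sample, handle):
    """Write only the metadata rows of QC-passing samples as TSV.

    Samples without a QC record are treated as failing.
    Returns the list of sample IDs that were kept.
    """
    kept_rows = [
        row
        for row in metadata_rows
        if row["sampleID"] in qc_by_sample and passes_qc(qc_by_sample[row["sampleID"]])
    ]
    writer = csv.DictWriter(
        handle,
        fieldnames=list(metadata_rows[0].keys()),
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(kept_rows)
    return [row["sampleID"] for row in kept_rows]


# Toy inputs (invented for this example).
metadata = [
    {"sampleID": "S1", "location": "siteA"},
    {"sampleID": "S2", "location": "siteA"},
    {"sampleID": "S3", "location": "siteB"},
]
qc = {
    "S1": {"mean_depth": 25.0, "missing_fraction": 0.02, "heterozygosity": 0.01},
    "S2": {"mean_depth": 4.0, "missing_fraction": 0.40, "heterozygosity": 0.01},
    "S3": {"mean_depth": 18.0, "missing_fraction": 0.05, "heterozygosity": 0.20},
}

# In the workflow the handle would be the opened metadata.qcpass.tsv file.
buffer = io.StringIO()
kept = write_qcpass_metadata(metadata, qc, buffer)
print(kept)
```

In this toy data S2 fails on depth and missingness and S3 fails on heterozygosity, so only S1 is written to the QC-pass metadata.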
bcftools call now also outputs the GQ/GP fields, which should be useful for variant filtering. I will add variant filters in this or another PR.
TODO:
[x] Add cell tags, tidy up the notebook, and add it to the Jupyter book.
[x] In the analysis notebooks, add the code to restrict VCF inputs to high-quality samples.
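To illustrate what restricting VCF inputs to high-quality samples involves, here is a small standard-library sketch. It is not the code used in the analysis notebooks; the function name `subset_vcf_lines` and the toy VCF are hypothetical. It keeps the nine fixed VCF columns plus the columns of the QC-pass samples, and it does not recompute INFO fields such as allele counts.

```python
def subset_vcf_lines(lines, keep_samples):
    """Yield VCF lines restricted to the given sample columns."""
    keep = set(keep_samples)
    columns = None
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("##"):
            # Meta-information lines are passed through unchanged.
            yield line
            continue
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            # Columns 0-8 are the fixed VCF fields; samples start at index 9.
            columns = list(range(9)) + [
                i for i, name in enumerate(fields) if i >= 9 and name in keep
            ]
        elif columns is None:
            raise ValueError("record found before the #CHROM header line")
        yield "\t".join(fields[i] for i in columns)


# Toy VCF (invented for this example).
vcf = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2\tS3",
    "2L\t100\t.\tA\tT\t50\tPASS\t.\tGT:GQ\t0/1:40\t./.:0\t1/1:35",
]

# In the workflow the sample list would come from the sample ID column
# of the QC-pass metadata file.
subset = list(subset_vcf_lines(vcf, ["S1", "S3"]))
for out_line in subset:
    print(out_line)
```

On real data the same restriction could be delegated to `bcftools view -S` with a file of sample IDs, which is likely to be faster on large VCFs.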
Addresses #28