sanjaynagi / AmpSeeker

A state-of-the-art snakemake workflow for amplicon sequencing
https://sanjaynagi.github.io/AmpSeeker/

sample-qc infrastructure #106

Closed sanjaynagi closed 5 months ago

sanjaynagi commented 5 months ago

Addresses #28

I have implemented a sample QC notebook. It analyses per-sample depth, missing calls, and heterozygosity in order to exclude low-quality samples. It also computes the autosome/sex chromosome depth ratio, but this is not used yet.

The notebook then writes a new metadata file to results/config/metadata.qcpass.tsv, which becomes the input metadata for all further analysis steps, such as the allele frequency, PCA, and ag-vampir modules.

QC steps such as coverage should still analyse the poor-quality samples, for completeness.
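
As a rough illustration of the filtering described above, here is a minimal sketch using only the Python standard library. The column names (`sample_id`, `mean_depth`, `frac_missing`, `heterozygosity`), the thresholds, and the helper names are all hypothetical; the actual notebook may compute and apply these differently.

```python
import csv
import io

# Hypothetical thresholds; the real cut-offs are not stated in this PR.
MIN_MEAN_DEPTH = 10.0
MAX_MISSING_FRAC = 0.5
MAX_HET = 0.2


def passes_qc(row):
    """Return True if a per-sample row clears all three hypothetical thresholds."""
    return (
        float(row["mean_depth"]) >= MIN_MEAN_DEPTH
        and float(row["frac_missing"]) <= MAX_MISSING_FRAC
        and float(row["heterozygosity"]) <= MAX_HET
    )


def write_qcpass_metadata(metadata_in, metadata_out):
    """Copy rows for passing samples from one TSV handle to another.

    Returns (n_total, n_pass). The input is only read, so steps that need
    every sample (e.g. coverage) can keep using the original metadata.
    """
    reader = csv.DictReader(metadata_in, delimiter="\t")
    writer = csv.DictWriter(
        metadata_out,
        fieldnames=reader.fieldnames,
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    n_total = n_pass = 0
    for row in reader:
        n_total += 1
        if passes_qc(row):
            writer.writerow(row)
            n_pass += 1
    return n_total, n_pass


# Toy metadata with invented values, standing in for the real per-sample stats.
demo_tsv = (
    "sample_id\tmean_depth\tfrac_missing\theterozygosity\n"
    "s1\t85.2\t0.02\t0.08\n"
    "s2\t3.1\t0.71\t0.01\n"
    "s3\t40.0\t0.10\t0.35\n"
)
out = io.StringIO()
counts = write_qcpass_metadata(io.StringIO(demo_tsv), out)
```

In a real run the two handles would be files opened on the full metadata and on results/config/metadata.qcpass.tsv rather than in-memory strings.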

bcftools call now also outputs the GQ/GP FORMAT fields, which should be useful for variant filtering. I will add variant filters in this or another PR.
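
The filters themselves are not part of this PR. As one possible use of the GQ field, the sketch below masks individual genotype calls whose GQ falls under a cut-off. The threshold, the function name, and the choice to mask calls rather than drop whole sites are assumptions for illustration only, and the code assumes diploid calls.

```python
MIN_GQ = 20  # hypothetical cut-off, not taken from the PR


def mask_low_gq(vcf_line, min_gq=MIN_GQ):
    """Set GT to './.' for samples whose GQ is missing or below min_gq.

    Takes one tab-separated VCF data line (not a header line) and returns
    the edited line without a trailing newline. Assumes diploid genotypes.
    """
    fields = vcf_line.rstrip("\n").split("\t")
    keys = fields[8].split(":")  # column 9 is FORMAT
    if "GT" not in keys or "GQ" not in keys:
        return "\t".join(fields)
    gt_i, gq_i = keys.index("GT"), keys.index("GQ")
    for i in range(9, len(fields)):  # sample columns start at column 10
        values = fields[i].split(":")
        # VCF allows trailing sub-fields to be dropped, so GQ may be absent.
        gq = values[gq_i] if gq_i < len(values) else "."
        if gq == "." or float(gq) < min_gq:
            values[gt_i] = "./."
            fields[i] = ":".join(values)
    return "\t".join(fields)


# Invented example line with two samples: GQ 35 is kept, GQ 5 is masked.
demo_line = "2L\t100\t.\tA\tT\t50\t.\t.\tGT:PL:GQ\t0/1:30,0,40:35\t1/1:20,5,0:5"
masked = mask_low_gq(demo_line)
```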

TODO: