mbhall88 / head_to_head_pipeline

Snakemake pipelines to run the analysis for the Illumina vs. Nanopore comparison.
GNU General Public License v3.0

Compare pandora calls to truth assemblies #49

Closed mbhall88 closed 3 years ago

mbhall88 commented 4 years ago

This analysis is analogous to #42. However, we will check the SNP-only VCF and the "full" VCF.

mbhall88 commented 3 years ago

A lot of the investigation for this has been happening in #48.

However, I wanted to add some information about indels here, as this will be important for #8.

In https://github.com/rmcolq/pandora/issues/232#issuecomment-732756241, @iqbal-lab asked:

Instead of snps/All could we see SNPs/indels?

However, varifier assesses all variant types. To get around this, I pulled out the indels from the varifier recall VCF and assessed those.
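
The exact method used to pull out the indels isn't given here. As a rough sketch only, assuming uncompressed VCF text and assuming an "indel" is any record with at least one ALT allele whose length differs from REF, the filtering could look something like this. The function names `is_indel` and `filter_indels` are hypothetical and not part of varifier; symbolic alleles such as `<DEL>` are not handled specially.

```python
from typing import Iterable, Iterator, Sequence


def is_indel(ref: str, alts: Sequence[str]) -> bool:
    """Return True if any ALT allele differs in length from REF.

    Assumption: this length-based rule is what "indel" means here.
    Missing (".") and spanning-deletion ("*") alleles are ignored.
    """
    return any(len(alt) != len(ref) for alt in alts if alt not in (".", "*"))


def filter_indels(vcf_lines: Iterable[str]) -> Iterator[str]:
    """Yield header lines unchanged, plus only the records that look like indels."""
    for line in vcf_lines:
        if line.startswith("#"):
            yield line  # keep meta-information and the column header
            continue
        fields = line.rstrip("\n").split("\t")
        ref, alts = fields[3], fields[4].split(",")  # REF and ALT columns
        if is_indel(ref, alts):
            yield line


if __name__ == "__main__":
    import sys

    # Usage (hypothetical file names): python filter_indels.py < recall.vcf > indels.vcf
    sys.stdout.writelines(filter_indels(sys.stdin))
```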

Recall

So I have a fairly crude measure of the indel-only recall. (This measure counts Partial_TP as a TP. I am not sure whether this is correct, but it is easily changed.)

mada_104 recall: 62.60%
mada_1-44 recall: 70.16%
mada_130 recall: 80.00%
mada_132 recall: 72.73%
mada_116 recall: 75.94%
mada_125 recall: 69.68%
mada_102 recall: 71.64%
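
For reference, a minimal sketch of how a crude rate like this could be computed once the classification for each indel record has been collected into a list of strings. It assumes the denominator is every classified record and that the labels are spelled `TP` and `Partial_TP`; `classification_rate` is a hypothetical helper, not a varifier function. The `partial_as_tp` flag is the "easily changed" part. Applied to the classifications from the recall VCF this would give recall; applied to those from the precision VCF it would give precision.

```python
from collections import Counter
from typing import Iterable


def classification_rate(classifications: Iterable[str], partial_as_tp: bool = True) -> float:
    """Fraction of classified records counted as true positives.

    Assumption: every record carries exactly one classification label,
    and all labels other than TP (and optionally Partial_TP) count as misses.
    """
    counts = Counter(classifications)
    true_positives = counts["TP"]
    if partial_as_tp:
        true_positives += counts["Partial_TP"]
    total = sum(counts.values())
    return true_positives / total if total else float("nan")


# Toy labels for illustration only, not data from this issue.
labels = ["TP", "TP", "Partial_TP", "FP"]
print(f"{classification_rate(labels):.2%}")                       # 75.00%
print(f"{classification_rate(labels, partial_as_tp=False):.2%}")  # 50.00%
```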

A plot of the varifier classifications for each sample

[image: varifier classifications for each sample]

Precision

mada_104 precision: 23.87%
mada_1-44 precision: 48.35%
mada_130 precision: 15.68%
mada_132 precision: 6.91%
mada_116 precision: 27.38%
mada_125 precision: 31.27%
mada_102 precision: 35.56%

A plot of the varifier classifications

[image: varifier classifications]