Dear @pkrusche and team,
I have run an analysis on DeepVariant variant calls from a Genome in a Bottle sample, benchmarking them with hap.py against the GIAB benchmark VCF using a Singularity image pulled from docker://pkrusche/hap.py.
I used the following command (in a Snakemake rule):
shell: "export HGREF={input.ref_genome}; /opt/hap.py/bin/hap.py {input.truth_vcf} {input.query_vcf} --false-positives {input.confidence_bed} --target-regions {input.target_bed} -r {input.ref_genome} --roc QUAL --roc-filter RefCall -o {params.prefix} -V --engine=vcfeval --engine-vcfeval-template {input.ref_sdf} --threads {threads} --logfile {log}"
However, I am confused by the results. I added the --roc option because it is the only curve option I could find (there does not seem to be a PR-curve option?). Yet the documentation says that precision and recall are calculated, and those are also the column names I see in the output (see the header and first two rows below), not ROC metrics:
How is it possible to have a recall of 0 and a precision of 1? Or are the metrics simply mislabelled and should they actually be TPR and FPR, so that this is meant to be a ROC plot, as the flag name suggests? The plot I made also looks like a ROC curve.
Additionally, if I plot METRIC.Recall against METRIC.Precision from the roc files, I get a curve with a typical ROC shape, whereas if I plot the values computed from the counts as described in happy.md, I get a different curve, one that does look more like a PR curve:
Recall = TRUTH.TP / (TRUTH.TP + TRUTH.FN)
Precision = QUERY.TP / (QUERY.TP + QUERY.FP)
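For reference, this is roughly how I make the two plots (a minimal sketch using pandas/matplotlib; the roc file name is just an example, and the column names are the ones from the hap.py roc output described above):

import pandas as pd
import matplotlib.pyplot as plt

# Example file name only -- substitute the actual *.roc.*.csv.gz produced by hap.py for your run.
roc = pd.read_csv("prefix.roc.Locations.SNP.csv.gz")

# Curve 1: the METRIC.* columns as reported directly by hap.py
plt.plot(roc["METRIC.Recall"], roc["METRIC.Precision"], label="METRIC.* columns")

# Curve 2: recall/precision recomputed from the counts, following happy.md
recall = roc["TRUTH.TP"] / (roc["TRUTH.TP"] + roc["TRUTH.FN"])
precision = roc["QUERY.TP"] / (roc["QUERY.TP"] + roc["QUERY.FP"])
plt.plot(recall, precision, label="recomputed from counts")

plt.xlabel("Recall")
plt.ylabel("Precision")
plt.legend()
plt.show()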
Thank you in advance, Eva