Closed holtjma closed 4 years ago
Thank you for testing and evaluating MapCaller. I think this should be a bug. I did not output GT field for indels. I'll fix this bug.
I've modified the VCF output format and updated MapCaller (v0.9.9.8). Please test and evaluate MapCaller again and let me know if there is any suggestions. Thank you very much.
BTW, could you provide me your test sets so that I can analyze the results? Thank you!
I'll put it on my to-do list to run with the newer version and get back to you within the next couple days hopefully.
The sequencing runs I'm using were sequenced internally, but they're all replicates of NA12878 and I used the Genome in a Bottle truth set for evaluation. I might be able to share the exact runs if you're interested, but it's not something I'm authorized to do myself.
New results look better, looks like adding the indel GT fields increasing precision/recall by about 10% each. Most of the remaining ones I'm simply not finding entries for in MapCaller's output (i.e. true false negatives for v0.9.9.8):
Threshold True-pos-baseline True-pos-call False-pos False-neg Precision Sensitivity F-measure
----------------------------------------------------------------------------------------------------
None 3284554 3383851 218011 257933 0.9395 0.9272 0.9333
Threshold True-pos-baseline True-pos-call False-pos False-neg Precision Sensitivity F-measure
----------------------------------------------------------------------------------------------------
None 3294325 3394001 214305 248162 0.9406 0.9299 0.9352
Threshold True-pos-baseline True-pos-call False-pos False-neg Precision Sensitivity F-measure
----------------------------------------------------------------------------------------------------
None 3367435 3472763 204531 175052 0.9444 0.9506 0.9475
Thank you for letting me know the evaluation result. There is one more thing I'd like to say about the evaluation. According to my experience, some of the indel events are ambiguous. There are more than one alignments including such indels having the same alignment scores. Therefore, the predicted positions of those indel events maybe a few bases away from the annotated ones. I was wondering if your evaluation metric considers such ambiguity. Anyway, I appreciate the evaluation you did for me. Thank you!!!
Yep, RTG vcfeval (https://github.com/RealTimeGenomics/rtg-tools) is actually really good at handling indels (that's the main reason I use it for evaluation) and is one of the standards used for this type of thing. The other one I see a lot is hap.py (https://github.com/Illumina/hap.py) which wraps RTG but has another evaluation method.
Hello,
As mentioned in Issue #19, I'm evaluating MapCaller using RTG vcfeval along with the GIAB truth set. To get it to run, I wrote a quick script to reformat the samples moving the GT, AD, and DP fields from INFO into a FORMAT and sample field. I also made the assumption that the reference allele depth would equal "DP - AD" (please correct if that's a bad assumption given the current output format).
I ran it on three replicates of NA12878 using b38 (with the HLA and alt contigs, ~3300 contigs total) and the results were less than expected with regards to sensitivity (for comparison, most pipelines I've tested are over 99% for sensitivity with higher variability in precision):
One of the main issues I saw was a lack of a GT call for several of the fields. For example, this was a line in one of the outputs where I would have expected a GT of "1/1", but the information was completely absent:
I don't know whether this is a bug, user error on my part, or working as intended (i.e. sometimes GT field is simply missing). Regardless, it would be good to be able to force a GT call for the purpose of evaluation.