Closed fnothaft closed 7 years ago
We discussed this earlier and decided it didn't make sense to store a quality score per variant because of our split-allelic model. Has that argument changed?
Yeah, I think that may've been a miscommunication. I don't think there's a meaningful way to generate a "correct" quality score for multiallelic sites that we split, but there's a lot of benchmarking tooling that relies on the variant quality being there, and the score is meaningful in the common case (a biallelic variant).
I may have also just changed my opinion over time. That said, I need the field. :-p
> I don't think there's a meaningful way to generate a "correct" quality score for multiallelic sites that we split...
Right, that was the reason for dropping several fields that are reserved keys in the VCF specification from our formats. I don't have a problem bringing Variant.quality back, but what should we do in the splitting case?
Yeah, I'm not 100% sure there, but since we set the splitFromMultiallelic field to true, I'm currently leaning towards preserving the value and letting splitFromMultiallelic == true be advisory that the value might be wrong.
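To make the proposal concrete, here is a rough Python sketch of the "preserve the value, treat the flag as advisory" behavior. It is not the actual schema or converter code: the `Variant` class, `split_site`, and `quality_is_trustworthy` are hypothetical names made up for illustration, and only `quality` and `split_from_multiallelic` mirror the fields discussed above.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Variant:
    # Hypothetical stand-in for a split-allelic variant record.
    contig: str
    start: int
    ref: str
    alt: str
    quality: Optional[float]       # site-level quality copied from the source record
    split_from_multiallelic: bool  # advisory: quality may not be meaningful if True


def split_site(contig: str, start: int, ref: str,
               alts: List[str], quality: Optional[float]) -> List[Variant]:
    """Emit one Variant per alternate allele, preserving the site quality."""
    was_split = len(alts) > 1
    return [Variant(contig, start, ref, alt, quality, was_split) for alt in alts]


def quality_is_trustworthy(variant: Variant) -> bool:
    """Downstream check: only rely on quality for sites that were not split."""
    return variant.quality is not None and not variant.split_from_multiallelic


biallelic = split_site("chr1", 100, "A", ["T"], 50.0)
multiallelic = split_site("chr1", 200, "A", ["T", "G"], 30.0)
```

In this sketch the biallelic record keeps a quality that benchmarking tools can use directly, while both records from the multiallelic site carry the same copied value plus the flag, so a consumer can decide whether to ignore it.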
Needed for https://github.com/bigdatagenomics/avocado/issues/253