iqbal-lab-org / minos

Variant call adjudication
MIT License

Null genotypes missing GT_CONF_PERCENTILE value #78

Closed mbhall88 closed 5 years ago

mbhall88 commented 5 years ago

I have a multisample VCF from minos v0.5.1 where some samples are missing the GT_CONF_PERCENTILE field. It seems to happen only on samples with a null genotype (./.). An example of the FORMAT string

GT:DP:COV:GT_CONF:GT_CONF_PERCENTILE:STATUS

and an example incorrect sample entry

./.:0:0,0:0.0:FAIL
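
To make the mismatch concrete, here is a minimal sketch that pairs the FORMAT keys with the sample values from the example above. It is plain string handling, not minos code, and the helper name `sample_matches_format` is made up for illustration.

```python
def sample_matches_format(format_string, sample_string):
    """Return True if the sample has one value per FORMAT key."""
    keys = format_string.split(":")
    values = sample_string.split(":")
    return len(keys) == len(values)


format_string = "GT:DP:COV:GT_CONF:GT_CONF_PERCENTILE:STATUS"
bad_sample = "./.:0:0,0:0.0:FAIL"

# Six keys but only five values, so the check fails.
print(sample_matches_format(format_string, bad_sample))

# Pairing by position shows what a naive parser would see:
# the STATUS value "FAIL" lands under GT_CONF_PERCENTILE.
paired = dict(zip(format_string.split(":"), bad_sample.split(":")))
print(paired["GT_CONF_PERCENTILE"])
```

A parser that pairs keys and values by position would read "FAIL" as the percentile here, which is presumably why downstream tools choke on such entries.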

bricoletc commented 5 years ago

Hey @mbhall88, using the latest version of minos (0.9.1; your issue uses 0.5.1 from 18th March), this problem is no longer present.

Meaning that a './.' GT field leads to no GT_CONF_PERCENTILE field, as per https://github.com/iqbal-lab-org/minos/blob/30e526e4652a96c41391072aa3304409807a310e/minos/adjudicator.py#L187-L188

An example FORMAT field on data I worked on today: DP:GT:COV:GT_CONF 0:./.:0,0:0.0

OK to close?

iqbal-lab commented 5 years ago

no - it turns out a VCF needs to have the same things in all the FORMAT fields - which from my point of view is crazy - might as well have it in the header. i think we will need, for downstream people, to just have a GT_CONF_PERCENTILE of 0 when there is no call, and just ignore it. i prefer our current behaviour in terms of logic, but it breaks too many tools
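
A rough sketch of what that padding could look like, assuming the sample arrives as colon-separated strings like the example earlier in the thread. This is not how minos implements it; `fill_missing_fields`, `target_format` and `placeholders` are hypothetical names used only for illustration.

```python
def fill_missing_fields(record_format, sample, target_format, placeholders):
    """Rewrite a sample so it has one value for every key in target_format.

    Keys absent from the record's own FORMAT get a value from placeholders.
    """
    present = dict(zip(record_format.split(":"), sample.split(":")))
    values = []
    for key in target_format.split(":"):
        if key in present:
            values.append(present[key])
        else:
            # Assumption: a placeholder is supplied for every key that may be absent.
            values.append(placeholders[key])
    return ":".join(values)


target_format = "DP:GT:COV:GT_CONF:GT_CONF_PERCENTILE"
placeholders = {"GT_CONF_PERCENTILE": "0"}

# The null-genotype example from earlier in the thread, padded with a 0 percentile.
filled = fill_missing_fields("DP:GT:COV:GT_CONF", "0:./.:0,0:0.0", target_format, placeholders)
print(filled)  # 0:./.:0,0:0.0:0
```

Downstream code would then have to treat a percentile of 0 on a ./. genotype as "no call" rather than as a real value.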

iqbal-lab commented 5 years ago

i think that was a race condition between @mbhall88 and me

bricoletc commented 5 years ago

I see. It is strange that FORMAT should be the same across records; that seems to defy the point.

Maybe have '.' rather than '0' for no calls though?

mbhall88 commented 5 years ago

I think this would cause an issue as the field is likely defined as a number?

iqbal-lab commented 5 years ago

yes i think it has to be the right type