charite / jannovar

Annotation of VCF variants with functional impact and from databases (executable+library)
http://jannovar.readthedocs.io/en/master/

Incorrect InheritanceMode calculation for multisample VCF with multiple alleles. #284

Open julesjacobsen opened 7 years ago

julesjacobsen commented 7 years ago

Consider the following:

##fileformat=VCFv4.2
##contig=<ID=1,length=249250621>
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Cain    Adam    Eva Abel
1   145513533   .   T   C,A 123.15  PASS    GENE=RBM8A  GT:DP   1/2:21  0/1:33  0/1:33  1/1:33
PedPerson probandPerson = new PedPerson("Family", "Cain", "Adam", "Eva", Sex.MALE, Disease.AFFECTED, new ArrayList<String>());
PedPerson brotherPerson = new PedPerson("Family", "Abel", "Adam", "Eva", Sex.MALE, Disease.UNAFFECTED, new ArrayList<String>());
PedPerson motherPerson = new PedPerson("Family", "Eva", "0", "0", Sex.FEMALE, Disease.UNAFFECTED, new ArrayList<String>());
PedPerson fatherPerson = new PedPerson("Family", "Adam", "0", "0", Sex.MALE, Disease.UNAFFECTED, new ArrayList<String>());
Pedigree pedigree = buildPedigree(probandPerson, brotherPerson, motherPerson, fatherPerson);

The A allele should be flagged as compatible with autosomal dominant (AD) inheritance. Instead, the compatibility list is empty.

Similarly for autosomal recessive (AR):

##fileformat=VCFv4.2
##contig=<ID=1,length=249250621>
#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  Cain    Adam    Eva
1   145513533   .   T   C,A 123.15  PASS    GENE=RBM8A  GT:DP   2/2:21  0/1:33  1/1:33
PedPerson probandPerson = new PedPerson("Family", "Cain", "Adam", "Eva", Sex.MALE, Disease.AFFECTED, new ArrayList<String>());
PedPerson motherPerson = new PedPerson("Family", "Eva", "0", "0", Sex.FEMALE, Disease.UNAFFECTED, new ArrayList<String>());
PedPerson fatherPerson = new PedPerson("Family", "Adam", "0", "0", Sex.MALE, Disease.UNAFFECTED, new ArrayList<String>());
Pedigree pedigree = buildPedigree(probandPerson, motherPerson, fatherPerson);

I think much of this comes from two assumptions: in testing, that there is never more than one alternate allele at a given position, and in the Genotype class, that alleles can only have the values -1, 0 or 1. Neither always holds in a multi-sample VCF.
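One possible way to handle this is to evaluate each ALT allele separately by recoding every sample's call into the -1/0/1 form. The sketch below is only an illustration of that idea, not Jannovar's implementation: the class `PerAlleleGenotype` and its methods are hypothetical, and treating "any other allele" as 0 is an assumption that a real fix might not want to make.

```java
import java.util.Arrays;

// Hypothetical helper, not part of Jannovar or Exomiser.
public class PerAlleleGenotype {

    /** Parses a GT string such as "1/2", "0|1" or "./." into allele indices; "." becomes -1. */
    public static int[] parse(String gt) {
        String[] parts = gt.split("[/|]");
        int[] alleles = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            alleles[i] = parts[i].equals(".") ? -1 : Integer.parseInt(parts[i]);
        }
        return alleles;
    }

    /**
     * Recodes a call relative to one ALT allele (1-based index into the ALT column):
     * -1 stays a no-call, the chosen ALT becomes 1, everything else becomes 0.
     * Assumption: other ALT alleles are treated like the reference here.
     */
    public static int[] recodeForAllele(int[] alleles, int altIndex) {
        int[] out = new int[alleles.length];
        for (int i = 0; i < alleles.length; i++) {
            if (alleles[i] < 0) {
                out[i] = -1;
            } else {
                out[i] = (alleles[i] == altIndex) ? 1 : 0;
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Sample names and calls taken from the first VCF line in this issue (ALT = C,A).
        String[] names = {"Cain", "Adam", "Eva", "Abel"};
        String[] calls = {"1/2", "0/1", "0/1", "1/1"};
        for (int alt = 1; alt <= 2; alt++) {
            for (int i = 0; i < names.length; i++) {
                int[] recoded = recodeForAllele(parse(calls[i]), alt);
                System.out.println("ALT " + alt + " " + names[i] + " " + Arrays.toString(recoded));
            }
        }
    }
}
```

For ALT index 2 (the A allele), only the affected proband Cain carries a 1 after recoding, which is the pattern an autosomal dominant check would be expected to accept.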

See the InheritanceModeAnalyserTest in Exomiser: two tests there are marked with @Ignore because they fail due to this.

holtgrewe commented 7 years ago

I think @visze has to look at this

holtgrewe commented 7 years ago

bumping milestone...

visze commented 7 years ago

Yeah, I am aware of that. This will need a complete refactoring of the current inheritance-mode checker. Right now I have no time for this, but maybe in the future.

pnrobinson commented 7 years ago

@visze If you realistically have no time for this, then somebody else should work on this. There was also another issue with the recessive inheritance checker that we had been discussing. Perhaps Jules and I could take a look at this part of the code and try to find a solution? I think we need to make sure the analysis is correct.

visze commented 7 years ago

I realistically have no time for that right now!

Which is the other issue? I am only aware of #291.