Open aarthi-mohan opened 3 years ago
Hi Aarthi,
Thank you for the question. EHv4 uses a different STR genotyping algorithm than EHv3. The two versions can output discordant genotypes at loci where many reads align poorly or ambiguously.
Could you please visualize both results with REViewer? It will make it easier to compare the two genotype calls.
Best wishes, Egor
Hi there,
I am trying to update ExpansionHunter from v3.0.1 to v4.0.2 and found some differences in repeat size genotyping between the two versions for a few of our loci. As an example, the HRAS locus is genotyped as 7/9 in v3.0.1 and 11/11 in v4.0.2. I used the same BAM (NA12878 WGS) as input for both versions and get the same read counts for this location from the realigned.bam file.
v3.0.1
"HRAS***h-Ras:_increased_risk_of_ovarian_cancer***11:530405-531236***GGCGTCCCCTGGAGAGAAGGGCGAGTGT": {
  "AlleleCount": 2,
  "Coverage": 46.7027027027027,
  "LocusId": "HRAS***h-Ras:_increased_risk_of_ovarian_cancer***11:530405-531236***GGCGTCCCCTGGAGAGAAGGGCGAGTGT",
  "ReadLength": 150,
  "Variants": {
    "HRAS***h-Ras:_increased_risk_of_ovarian_cancer***11:530405-531236***GGCGTCCCCTGGAGAGAAGGGCGAGTGT": {
      "CountsOfFlankingReads": "(0, 12), (1, 16), (2, 18), (3, 25), (4, 18), (5, 1)",
      "CountsOfInrepeatReads": "(3, 1), (4, 42), (5, 18)",
      "CountsOfSpanningReads": "()",
      "Genotype": "7/9",
      "GenotypeConfidenceInterval": "6-9/6-12",
      "ReferenceRegion": "11:530405-531236",
      "RepeatUnit": "GGCGTCCCCTGGAGAGAAGGGCGAGTGT",
      "VariantId": "HRAS***h-Ras:_increased_risk_of_ovarian_cancer***11:530405-531236***GGCGTCCCCTGGAGAGAAGGGCGAGTGT",
      "VariantSubtype": "Repeat",
      "VariantType": "Repeat"
    }
  }
}
v4.0.2
"HRAS***h-Ras:_increased_risk_of_ovarian_cancer***11:530405-531236***GGCGTCCCCTGGAGAGAAGGGCGAGTGT": {
  "AlleleCount": 2,
  "Coverage": 38.59459459459459,
  "FragmentLength": 431,
  "LocusId": "HRAS***h-Ras:_increased_risk_of_ovarian_cancer***11:530405-531236***GGCGTCCCCTGGAGAGAAGGGCGAGTGT",
  "ReadLength": 150,
  "Variants": {
    "HRAS***h-Ras:_increased_risk_of_ovarian_cancer***11:530405-531236***GGCGTCCCCTGGAGAGAAGGGCGAGTGT": {
      "CountsOfFlankingReads": "(1, 14), (2, 14), (3, 18), (4, 27), (5, 20), (6, 4)",
      "CountsOfInrepeatReads": "(5, 1), (6, 43), (7, 17)",
      "CountsOfSpanningReads": "(4, 2)",
      "Genotype": "11/11",
      "GenotypeConfidenceInterval": "9-14/11-17",
      "ReferenceRegion": "11:530405-531236",
      "RepeatUnit": "GGCGTCCCCTGGAGAGAAGGGCGAGTGT",
      "VariantId": "HRAS***h-Ras:_increased_risk_of_ovarian_cancer***11:530405-531236***GGCGTCCCCTGGAGAGAAGGGCGAGTGT",
      "VariantSubtype": "Repeat",
      "VariantType": "Repeat"
    }
  }
}
Appreciate your thoughts on why this would happen.
Thank you, Aarthi