PharmGKB / PharmCAT

The Pharmacogenomic Clinical Annotation Tool
Mozilla Public License 2.0

DPYD in 2.8.0: java.lang.IllegalStateException: STRAND MISMATCH 1 #155

Closed hudja closed 11 months ago

hudja commented 11 months ago

Hello, unfortunately, I now have the following error with 2.8.0 and DPYD:

java.lang.IllegalStateException: STRAND MISMATCH 1
    at org.pharmgkb.pharmcat.haplotype.DpydHapB3Matcher.mergePhasedDiplotypeMatch(DpydHapB3Matcher.java:182)
    at org.pharmgkb.pharmcat.haplotype.NamedAlleleMatcher.callDpyd(NamedAlleleMatcher.java:310)
    at org.pharmgkb.pharmcat.haplotype.NamedAlleleMatcher.call(NamedAlleleMatcher.java:202)
    at org.pharmgkb.pharmcat.Pipeline.call(Pipeline.java:233)
    at org.pharmgkb.pharmcat.PharmCAT.main(PharmCAT.java:166)

My data was preprocessed with the latest (2.8.0) preprocessor. I have multiple such cases; this is just one example (all positions not shown are REF):

chr1    97515865    1|0
chr1    97573863    0|1
chr1    97579893    0|1
chr1    97699535    1|0
chr1    97883329    1|1

Originally posted by @hudja in https://github.com/PharmGKB/PharmCAT/issues/150#issuecomment-1732659519

markwoon commented 11 months ago

I cannot reproduce the error based on what you have above.

I'm translating what you have to:

chr1    97515865    1|0  -> T|C (rs1801158)
chr1    97573863    0|1  -> C|T (rs56038477)
chr1    97579893    0|1  -> G|C (rs75017182)
chr1    97699535    1|0  -> C|T (rs2297595)
chr1    97883329    1|1  -> G|G (rs1801265)
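The translation above can be sketched in a few lines of Python. This is only an illustration, not PharmCAT code: `decode_gt` is a hypothetical helper, and the REF/ALT alleles are the ones from the VCF body posted later in this thread.

```python
def decode_gt(ref, alt, gt):
    """Translate a VCF GT value such as '1|0' into allele letters such as 'T|C'.

    Hypothetical helper for illustration; not part of PharmCAT.
    Index 0 is REF, index 1 and up are the comma-separated ALT alleles.
    """
    sep = "|" if "|" in gt else "/"       # '|' = phased, '/' = unphased
    alleles = [ref] + alt.split(",")
    calls = []
    for idx in gt.split(sep):
        # '.' marks a missing call and is passed through unchanged
        calls.append("." if idx == "." else alleles[int(idx)])
    return sep.join(calls)


# (position, REF, ALT, GT) as posted in this thread
records = [
    (97515865, "C", "T", "1|0"),
    (97573863, "C", "T", "0|1"),
    (97579893, "G", "C", "0|1"),
    (97699535, "T", "C", "1|0"),
    (97883329, "A", "G", "1|1"),
]

for pos, ref, alt, gt in records:
    print(f"chr1    {pos}    {gt}  -> {decode_gt(ref, alt, gt)}")
```

Because the separator is preserved, a phased call stays phased in the output, which makes it easy to see which allele sits on which strand.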

If those alleles are what you're expecting, then I don't understand why we're not getting the same results and I'm going to need the DPYD section of your VCF so I can reproduce this.

This error would indicate a REF/ALT mismatch somewhere. Although unlikely, maybe you're also missing a position? Are you getting any warnings at all?

Here's the resulting report (unzip for the HTML report): test155.report.zip

Please compare all the alleles in section 3 of the report. If any of them do not match what's in your VCF, I need to know which ones.

hudja commented 11 months ago

Yes, you are translating the codes correctly, but according to the report your test data is unphased. My data is phased. I can confirm that the unphased version of this VCF works fine.

This phased VCF body gives the error:

#CHROM  POS ID  REF ALT QUAL    FILTER  INFO    FORMAT  error
chr1    97515865    .   C   T   .   .   .   GT  1|0
chr1    97573863    .   C   T   .   .   .   GT  0|1
chr1    97579893    .   G   C   .   .   .   GT  0|1
chr1    97699535    .   T   C   .   .   .   GT  1|0
chr1    97883329    .   A   G   .   .   .   GT  1|1
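For anyone who wants to compare the phased and unphased behaviour on their own data, here is a minimal sketch that rewrites phased genotypes (`1|0`) as unphased ones (`0/1`). It is an illustration only: `unphase` and `unphase_line` are hypothetical helpers, not part of PharmCAT or its preprocessor, and the sketch assumes single-line records with a `GT` key in the FORMAT column.

```python
def unphase(gt):
    """Return the unphased form of a genotype, e.g. '1|0' -> '0/1'."""
    if "|" not in gt:
        return gt  # already unphased (or haploid)
    # Sort allele indexes numerically; missing calls ('.') go last.
    idx = sorted(gt.split("|"), key=lambda i: (i == ".", 0 if i == "." else int(i)))
    return "/".join(idx)


def unphase_line(line):
    """Unphase the GT of every sample in one VCF line; headers pass through."""
    if line.startswith("#"):
        return line
    fields = line.split()             # the thread shows space-separated columns
    gt_pos = fields[8].split(":").index("GT")
    for col in range(9, len(fields)):
        sample = fields[col].split(":")
        sample[gt_pos] = unphase(sample[gt_pos])
        fields[col] = ":".join(sample)
    return "\t".join(fields)          # VCF columns are tab-delimited


phased = [
    "chr1 97515865 . C T . . . GT 1|0",
    "chr1 97573863 . C T . . . GT 0|1",
    "chr1 97883329 . A G . . . GT 1|1",
]
for record in phased:
    print(unphase_line(record))
```

Dropping the phase discards the strand information, so this is useful for narrowing down the problem rather than as a fix.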

Test VCF file: https://www.dropbox.com/scl/fi/zkf76nqhgrtw3eutv7g2o/test2.vcf.gz?rlkey=lyq1vmnsiauxnet81iwwftqdx&dl=0

java -jar pharmcat-2.8.0-all.jar -vcf test2.vcf.gz -o ./
java.lang.IllegalStateException: STRAND MISMATCH 1
    at org.pharmgkb.pharmcat.haplotype.DpydHapB3Matcher.mergePhasedDiplotypeMatch(DpydHapB3Matcher.java:182)
    at org.pharmgkb.pharmcat.haplotype.NamedAlleleMatcher.callDpyd(NamedAlleleMatcher.java:310)
    at org.pharmgkb.pharmcat.haplotype.NamedAlleleMatcher.call(NamedAlleleMatcher.java:202)
    at org.pharmgkb.pharmcat.Pipeline.call(Pipeline.java:233)
    at org.pharmgkb.pharmcat.PharmCAT.main(PharmCAT.java:166)

java -version
java version "17.0.8" 2023-07-18 LTS
Java(TM) SE Runtime Environment (build 17.0.8+9-LTS-211)
Java HotSpot(TM) 64-Bit Server VM (build 17.0.8+9-LTS-211, mixed mode, sharing)

markwoon commented 11 months ago

Good catch. Investigating...

hudja commented 11 months ago

Hi, I can see a commit referencing this issue. When do you plan to make a new release with the fix? Thank you!

markwoon commented 11 months ago

Released!