jonassibbesen / rpvg

Method for inferring path posterior probabilities and abundances from pangenome graph read alignments
MIT License

HST assignment: rpvg assigned two different homozygous HSTs instead of the expected heterozygous transcripts. #59

Open jjuhyunkim opened 11 months ago

jjuhyunkim commented 11 months ago

Hi Developers!

I have just two haplotypes in my graph, so I assumed that the rpvg results would contain one of three possible haplotype combinations, each with expression values:

  1. Hap1 and Hap1 (homozygous),
  2. Hap1 and Hap2 (heterozygous), and
  3. Hap2 and Hap2 (homozygous).
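
To make the assumption above concrete, here is a minimal Python sketch that enumerates the unordered pairings of two haplotypes. The labels `Hap1` and `Hap2` are just the placeholder names from the list, and the function name `possible_diplotypes` is hypothetical; this only illustrates what I expected, not how rpvg works internally.

```python
from itertools import combinations_with_replacement

def possible_diplotypes(haplotypes):
    """Return every unordered pair of haplotypes, allowing a haplotype to pair with itself."""
    return list(combinations_with_replacement(haplotypes, 2))

# Placeholder labels taken from the list above.
for first, second in possible_diplotypes(["Hap1", "Hap2"]):
    kind = "homozygous" if first == second else "heterozygous"
    print(first, second, kind)
```

With two haplotypes this gives exactly three pairings: two homozygous and one heterozygous.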

However, for several isoforms, rpvg reported each haplotype-specific transcript paired with itself (the same haplotype twice) with a probability of 1, rather than a single heterozygous pairing of the two different haplotypes. I have attached a screenshot below.

[image]

This is what I expected:

transcript8801.chr6.nnic_H1 transcript8801.chr6.nnic_R1 674 1   12.999715   3.324959    12.999917   3.649044

This is what I got for some transcripts:

transcript8801.chr6.nnic_H1 transcript8801.chr6.nnic_H1 674 1   12.999715   3.324959    12.999715   3.324959
transcript8801.chr6.nnic_R1 transcript8801.chr6.nnic_R1 638 1   12.999917   3.6490449   12.999917   3.6490449
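
To find every transcript affected by this pattern, something like the following Python sketch could be used. It rests on assumptions taken only from the rows shown here, not from the rpvg documentation: the first two whitespace-separated fields are the paired transcript names, the fourth field is the probability, and the haplotype label is whatever follows the last underscore in a name (for example `_H1` or `_R1`). The helper names `base_id` and `conflicting_homozygous` are hypothetical.

```python
from collections import defaultdict

def base_id(name):
    # Assumption: the haplotype label follows the last underscore, e.g. "..._H1".
    return name.rsplit("_", 1)[0]

def conflicting_homozygous(lines, min_prob=1.0):
    """Return transcripts for which more than one haplotype is reported
    as a self-paired (homozygous) row with probability >= min_prob."""
    homozygous = defaultdict(set)
    for line in lines:
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or truncated lines
        name1, name2 = fields[0], fields[1]
        try:
            # Assumption: the fourth field holds the probability.
            prob = float(fields[3])
        except ValueError:
            continue  # skip a header line or other non-numeric row
        if name1 == name2 and prob >= min_prob:
            homozygous[base_id(name1)].add(name1)
    return {t: sorted(h) for t, h in homozygous.items() if len(h) > 1}

rows = [
    "transcript8801.chr6.nnic_H1 transcript8801.chr6.nnic_H1 674 1   12.999715   3.324959    12.999715   3.324959",
    "transcript8801.chr6.nnic_R1 transcript8801.chr6.nnic_R1 638 1   12.999917   3.6490449   12.999917   3.6490449",
]
print(conflicting_homozygous(rows))
```

For the two rows above, this flags `transcript8801.chr6.nnic` with both its `_H1` and `_R1` versions, while a single heterozygous row like the expected one would not be flagged.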

What could be the reason for this discrepancy? If this is due to using a simple graph rather than incorporating multiple haplotypes, do you think manually replacing the values as per my expectations would be a viable solution?