edwardslab-wustl / dxm

Deconvolution of Methylation Sequencing Data
GNU General Public License v3.0

Hypergeometric term in emission probability #3

Open nshen7 opened 3 years ago

nshen7 commented 3 years ago

Hello,

I'm trying to understand how the emission probability is implemented in DXM. In the helper function multinomial I can see the binomial distribution of i, the number of reads that came from the underlying methylated state, and the beta-binomials are returned by the function translateMeth. But I couldn't find where the hypergeometric term is computed. This may be due to my partial understanding of the source code. Could you point it out to me? Or is it omitted in the implementation?

Thanks, Ning

jredwards417 commented 3 years ago

Hi Ning, The quick answer is that there are not separate binomial, hypergeometric, and beta-binomial terms. They are all implemented in one equation (Equation 1 in the paper). If you have additional questions, please let us know.

nshen7 commented 3 years ago

Hello, thanks for the reply. However, I'm still wondering where exactly Eq. 1 is calculated in the source code.

jredwards417 commented 3 years ago

See the "multinomial" function in DXMfunctions.pyx. Lines 370-381 in particular. The terms are split up differently and the notation is a little different (e.g. "frac1" in the code is "p" in Eq. 1) but the terms are all there. Also, everything is computed in terms of logs to help with precision issues.
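To illustrate the point about working in logs, here is a minimal generic sketch. It is not the DXM code and does not reproduce Eq. 1; the function names (`log_choose`, `log_binom_pmf`, `log_hypergeom_pmf`, `log_sum_exp`) are hypothetical and only show how binomial and hypergeometric probabilities can be evaluated and combined in log space without underflow.

```python
import math

def log_choose(n, k):
    """Log of the binomial coefficient C(n, k), computed with log-gamma."""
    if k < 0 or k > n:
        return float("-inf")  # impossible outcome: probability zero
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)

def log_binom_pmf(i, n, p):
    """Log P(i successes in n trials), assuming 0 < p < 1."""
    return log_choose(n, i) + i * math.log(p) + (n - i) * math.log1p(-p)

def log_hypergeom_pmf(j, k, i, n):
    """Log P(j of k items drawn without replacement are among the i marked items out of n)."""
    return log_choose(i, j) + log_choose(n - i, k - j) - log_choose(n, k)

def log_sum_exp(values):
    """Log of a sum of probabilities given as logs, shifted by the max for stability."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))
```

Products of probabilities become sums of logs, and a sum over a latent count (such as i) goes through `log_sum_exp`, so very small terms are not rounded to zero before they are combined.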

nshen7 commented 3 years ago

Hi, thanks for pointing out the code chunk! In this chunk, I can find the binomial probability stored in the variable tune1. The beta-binomials are calculated by translateMeth(0,i,j,refTable,maxSize,METHARRAY,UNMETHARRAY) and translateMeth(1,n-i,k-j,refTable,maxSize,METHARRAY,UNMETHARRAY). But I couldn't locate the hypergeometric probability. Did I misunderstand something?