USCbiostats / LUCIDus

the new version of LUCID

error: underflow when calculating the log likelihood #14

Closed by Yinqi93 2 years ago

Yinqi93 commented 2 years ago

Mai ran LUCID on her own data and encountered the issue below:

Initialize LUCID with mclust 

iteration 1 : E-step finished.
iteration 1 : Invalid estimates 
Initialize LUCID with mclust 

iteration 1 : E-step finished.
iteration 1 : M-step finished,  loglike =  -Inf 
Error in if (abs(res.loglik - new.loglik) < control$tol) { : 
  missing value where TRUE/FALSE needed

The bug arises when calculating the expected likelihood at the first iteration. I should rewrite the likelihood function.

Yinqi93 commented 2 years ago

Fixed. Created pull request c19d4e498b03337fde68fd885d8c78b13436db34

Yinqi93 commented 2 years ago

Another underflow problem: when calculating the likelihood of G->X, the softmax function (even with the log-sum-exp trick) returns probabilities that are exactly 0. Taking the log of these zeros gives log 0 = -Inf, which then propagates into the log likelihood as -Inf or NaN (for example, 0 * -Inf is NaN).
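To illustrate the failure mode described above, here is a minimal Python sketch (not the package's R code). The `softmax` helper and the `scores` values are hypothetical and chosen only so that one class probability underflows. It shows that subtracting the maximum protects against overflow but still lets small probabilities round to exactly 0.

```python
import math

def softmax(scores):
    """Softmax stabilized by subtracting the max before exponentiating."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical linear predictors for three latent clusters. The gap of 800
# is large enough that exp(-800) underflows to 0.0 in double precision.
scores = [0.0, -800.0, -5.0]
probs = softmax(scores)
print(probs[1])  # the second probability is exactly 0.0

# Python's math.log(0.0) raises ValueError, whereas R's log(0) returns -Inf,
# so the zero case is mapped to -inf by hand to mimic the R behavior.
log_probs = [math.log(p) if p > 0.0 else -math.inf for p in probs]
print(log_probs[1])  # -inf

# A zero weight multiplied by -inf is NaN under IEEE 754 arithmetic.
print(0.0 * log_probs[1])  # nan
```

The max subtraction only prevents `exp` from overflowing; it does nothing for terms far below the maximum, which is why the normalized probability can still be 0.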

Yinqi93 commented 2 years ago

Rewrote the function that calculates the likelihood of G->X. Instead of calculating the normalized probability and then taking its log, it now uses the log-sum-exp trick to calculate the log likelihood directly in log space. 347363d7d6f59dbd4387799ba821f634988845f5
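The idea of staying in log space can be sketched as follows. This is a Python illustration under assumed names (`log_softmax`, `scores`), not the code from the commit: the log probability is computed as the score minus the log-sum-exp of all scores, so no probability is ever formed and then passed to `log`.

```python
import math

def log_softmax(scores):
    """Return log(softmax(scores)) without forming the probabilities."""
    m = max(scores)
    # log-sum-exp: log(sum(exp(s))) = m + log(sum(exp(s - m)))
    lse = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - lse for s in scores]

# Same hypothetical scores that make a plain softmax underflow to 0.
scores = [0.0, -800.0, -5.0]
log_probs = log_softmax(scores)

# Every entry is finite, including the one whose probability is below
# the smallest representable double.
print(all(math.isfinite(lp) for lp in log_probs))  # True
```

Because the largest term inside the sum is always `exp(0) = 1`, the argument of `log` is at least 1, so the result cannot be `-inf` for finite scores.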