kkdey / CountClust

An R package for Grade of Membership models and visualization of count data

FitGoM NAN issue #36

Closed poflawless closed 3 years ago

poflawless commented 6 years ago

I used the CountClust FitGoM function to estimate the Bayes factor, but I got a NaN result, which led to the structure plot being displayed incorrectly. The code I used and the output are shown below:

FitGoM(t(deng.counts), K=2, tol=0.1, path_rda="/home/gaojingjian/chicken/CountCLUST/packages/MouseDeng2014.FitGoM.rda")

Fitting a Grade of Membership model (Taddy M., AISTATS 2012, JMLR 22, http://proceedings.mlr.press/v22/taddy12/taddy12.pdf)

Estimating on a 259 samples collection.
Fit and Bayes Factor Estimation for K = 2
log posterior increase: 159351162.5, 248355.8, 2501071.2, 6567078.2, 6066702.2, 5268034.4, 4513658.1, 2916710, 2796619.6, 1840268.3, 1426781.1, 1685662.5, 1047829.2, 1018065.5, 1006377, 993747.6, 754146.4, 739223.3, 613357.4, 630303.1, 393266.1, 574906.7, 449511.4, 276379.6, 120095.8, 136846.3, 41855.6, 7677.1, 1025.1, 88.9, 11.3, 1.9, 0.9, 0.3, done.
log BF( 2 ) = NA
NAN for Bayes factor.

traceback()
3: gzfile(file, "wb")
2: save(Topic_clus_list, file = path_rda)
1: FitGoM(t(deng.counts), K = 2, tol = 0.1, path_rda = "/home/gaojingjian/chicken/CountCLUST/package/MouseDeng2014.FitGoM.rda")

Also, maptpx was the latest version.
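
The traceback in this post ends in gzfile(file, "wb"), which is the point where save() opens the output file for writing. One possible cause, which is an assumption and not confirmed anywhere in this thread, is that the directory given in path_rda does not exist or is not writable. A minimal base R sketch for checking that before starting a long fit; ensure_writable_dir is a hypothetical helper name, not a CountClust function:

```r
# Hypothetical helper (not part of CountClust): make sure the directory that
# will hold the .rda file exists and is writable before calling FitGoM().
ensure_writable_dir <- function(path_rda) {
  out_dir <- dirname(path_rda)
  if (!dir.exists(out_dir)) {
    # recursive = TRUE also creates any missing parent directories
    dir.create(out_dir, recursive = TRUE)
  }
  # file.access() returns 0 when the requested access (mode 2 = write) is allowed
  unname(file.access(out_dir, mode = 2) == 0)
}

# Usage, with a temporary location standing in for the real output path
path_rda <- file.path(tempdir(), "gom_output", "example.FitGoM.rda")
if (!ensure_writable_dir(path_rda)) {
  stop("Cannot write to ", dirname(path_rda))
}
```

If the check passes and save() still fails, the cause lies elsewhere (for example disk quota or a file name problem), so this only rules out one candidate.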

kkdey commented 6 years ago

The log Bayes factor being NA should not really impact the structure plot. Also I would recommend using the compGoM function on the topic model output. Does the function actually output the .rda file after the run, and what are the issues with the structure plot that you faced?

poflawless commented 6 years ago

Yes, you are right: the log Bayes factor being NA has not affected the structure plot. The problem I encountered was due to differences between the graphical display systems on Linux and Windows. What still confuses me is that the maptpx documentation describes bf as: "An indicator for whether or not to calculate the Bayes factor for univariate K. If length(K)>1, this is ignored and Bayes factors are always calculated." I used K=2:7 with the official example data, yet the Bayes factor was still NaN. I can't find a reason.

kkdey commented 6 years ago

The maptpx formulation works slightly differently from the formulation here. I understand the confusion, and maybe the best way to address it would be to remove the BF output from maptpx altogether. We don't recommend using the BF from the maptpx/FitGoM output and would suggest the compGoM() function instead in the meantime.
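
A rough sketch of that suggestion, under two assumptions: that the FitGoM run wrote its fits to path_rda as a single saved object (the traceback earlier in the thread shows save(Topic_clus_list, file = path_rda)), and that compGoM() takes the samples-by-genes count matrix as data and the fits as model, as described in the CountClust documentation. compare_saved_fits is a hypothetical wrapper, not part of the package:

```r
# Hypothetical wrapper (not part of CountClust): load the fits that FitGoM
# saved to disk and pass them to compGoM() for model comparison.
compare_saved_fits <- function(counts, path_rda) {
  if (!requireNamespace("CountClust", quietly = TRUE)) {
    stop("CountClust must be installed to run compGoM()")
  }
  env <- new.env()
  # load() returns the names of the objects it restored from the file
  loaded <- load(path_rda, envir = env)
  # Assumption: the file holds one object, the list of topic model fits
  fits <- get(loaded[1], envir = env)
  CountClust::compGoM(data = counts, model = fits)
}

# Example call, reusing the objects from the original post:
# compare_saved_fits(t(deng.counts), path_rda)
```

The wrapper returns whatever compGoM() returns without inspecting it, since the exact structure of that result may differ between package versions.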

poflawless commented 6 years ago

Thank you so much for your consistent help and advice, glad to have a friend like you!