Closed jahanpd closed 4 years ago
You're absolutely right, it shouldn't be negative and it's a sign of the marginal distribution being estimated poorly. Please see this pull request to see if it fixes the issue: https://github.com/gregversteeg/bio_corex/pull/20 I was planning to merge it in but haven't had time to test it.
Hi Greg,
Thanks for your response. Implementing the clipping was very effective!
I shall close the issue.
Hi. My understanding of the TC is that it is guaranteed to be non-negative. However, I am having some unusual results when combining binary and continuous variables as described in a previous issue.
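For reference, the non-negativity claim follows from the standard definition of total correlation, written here as a sketch in generic notation (the symbols X_1, ..., X_n and p are assumptions for illustration, not names from the library):

```latex
TC(X) \;=\; \sum_{i=1}^{n} H(X_i) \;-\; H(X_1, \dots, X_n)
      \;=\; D_{\mathrm{KL}}\!\left( p(x_1, \dots, x_n) \,\Big\|\, \prod_{i=1}^{n} p(x_i) \right) \;\ge\; 0
```

The inequality holds for the true distributions. An estimate built from separately modeled marginals need not satisfy it.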
As an example, when I run the following code:
import numpy as np
import corex as ce

# A matrix with rows as samples and columns as variables.
X = np.array([[0, 0, 0, 0, 4.0],
              [0, 0, 0, 1, 26.0],
              [0, 1, 1, 0, 6.0],
              [1, 0, 1, 1, 30.0]], dtype=int)
layer1 = ce.Corex(n_hidden=2, dim_hidden=2, marginal_description='gaussian', n_repeat=10, verbose=1, seed=1)
layer1.fit(X)  # Fit on data
VERBOSE OUTPUT:
... Overall tc: -32.915963736593795
... Overall tc: -5.596481067514456e-07
... Overall tc: -22.582697642198013
... Best tc: -5.596481067514456e-07
I can only conclude that something is going wrong when the marginal probabilities are modeled. As previously stated: "The way the marginal probabilities are modeled in this case (with mixtures of Gaussians around each binary value) should be equivalent to modeling them as binary."
I also get negative TCs when running certain combinations of binary variables with gaussian marginals turned on.
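As a point of comparison, here is a minimal sketch of a plug-in (empirical-count) total correlation for the four binary columns of the example matrix above. This is not how CorEx computes its objective; the helper names `entropy` and `total_correlation` are hypothetical and only illustrate that a purely discrete estimate of this kind cannot go negative, which suggests the negative values come from the Gaussian marginal model.

```python
from collections import Counter
import math


def entropy(samples):
    """Plug-in Shannon entropy (in nats) of a sequence of hashable outcomes."""
    n = len(samples)
    counts = Counter(samples)
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def total_correlation(rows):
    """Plug-in TC: sum of per-column entropies minus the joint entropy."""
    columns = list(zip(*rows))
    marginal_sum = sum(entropy(col) for col in columns)
    joint = entropy([tuple(r) for r in rows])
    return marginal_sum - joint


# Binary columns only; the continuous fifth column is dropped for this check.
X_binary = [
    [0, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
]

tc = total_correlation(X_binary)
print("plug-in TC (nats):", tc)
```

Because this estimate is a KL divergence between the empirical joint and the product of empirical marginals, it is non-negative up to floating-point error for any discrete input.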
I'm not sure how to interpret the negative TCs in this context. Any help would be appreciated.
Best wishes