juselara1 / dmae

TensorFlow implementation of the Dissimilarity Mixture Autoencoder: https://arxiv.org/abs/2006.08177
MIT License

Strange covariance matrices #3

[Open] VolodyaCO opened this issue 3 years ago

VolodyaCO commented 3 years ago

So, I understand that you are learning the inverse covariance matrix: you define a fully trainable matrix X and obtain the inverse covariance as S = X @ X.T. In other words, you are using a factorisation of the inverse of the covariance matrix.
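
Just to make sure we are talking about the same thing, here is a minimal NumPy sketch of that parameterisation (the variable names are mine, not the library's):

import numpy as np

rng = np.random.default_rng(0)
X_factor = rng.normal(size=(2, 2))   # unconstrained trainable matrix (hypothetical name)
S = X_factor @ X_factor.T            # inverse covariance (precision), PSD by construction
cov = np.linalg.pinv(S)              # covariance recovered from the precision

Note that S is positive semidefinite by construction, but X_factor itself need not be.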

Now, I am using the following code to plot the ellipses learnt by the DMAE method on the dataset presented in issue #1:

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt


def plot_results(X, means, covariances):
    for i, (mean, covar) in enumerate(zip(means, covariances)):
        # covar = np.linalg.pinv(np.matmul(covar, covar.T))  # <<<< note this!, I'm commenting it!
        # Eigendecomposition gives the axis lengths (eigenvalues) and
        # orientation (eigenvectors) of the component's ellipse.
        v, w = np.linalg.eigh(covar)
        v = 2. * np.sqrt(2.) * np.sqrt(np.abs(v))  # axis lengths at ~2 standard deviations
        u = w[:, 0]

        # Plot an ellipse to show the Gaussian component
        angle = np.arctan(u[1] / u[0])
        angle = 180. * angle / np.pi  # convert to degrees
        ell = mpl.patches.Ellipse(mean, v[0], v[1], angle=180. + angle, color='green')
        ell.set_clip_box(plt.gca().bbox)
        ell.set_alpha(0.5)
        plt.gca().add_artist(ell)

    # Plot the data and its four shifted copies
    plt.scatter(*X.T, s=.2, color='black')
    plt.scatter(*(X + np.array([1, 0])).T, s=.2, color='blue')
    plt.scatter(*(X + np.array([0, 1])).T, s=.2, color='blue')
    plt.scatter(*(X + np.array([-1, 0])).T, s=.2, color='blue')
    plt.scatter(*(X + np.array([0, -1])).T, s=.2, color='blue')
    plt.xlim([-1, 2])
    plt.ylim([-1, 2])
    plt.show()

Calling plot_results(X, model2.layers[1].get_weights()[0], model2.layers[1].get_weights()[1]) with the learnt means and covariances yields a very good result!

[image: resulting plot, the ellipses fit the clusters well]

However, covar IS NOT the covariance matrix that defines the Gaussian, as it should be: what gets passed in is the raw trainable matrix X, so the actual covariance is pinv(X @ X.T). The correct way of doing this is to uncomment the line I point out in the code. When I do that, things go wrong!

Also, the visualisations produced with vis_utils look like this:

[image: first vis_utils plot]

[image: second vis_utils plot]

which is nonsense. I wonder what I'm doing wrong, or if the visualisation utils are somehow mistaken.

VolodyaCO commented 3 years ago

BTW, the covar matrices that seem to yield good Gaussians are not positive semidefinite: they have negative eigenvalues!
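
A quick way to check this, reusing model2 from the call above (I symmetrise first because np.linalg.eigvalsh assumes a symmetric input):

import numpy as np

for i, covar in enumerate(model2.layers[1].get_weights()[1]):
    sym = 0.5 * (covar + covar.T)      # symmetric part of the learnt matrix
    eigvals = np.linalg.eigvalsh(sym)
    print(i, eigvals, "PSD" if np.all(eigvals >= 0) else "not PSD")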

juselara1 commented 3 years ago

It seems like an overfitting problem here:

[image: plot showing the overfitted result]

Can you share how DissimilarityMixtureAutoencoderCov is initialized, or the complete notebook/code?

Maybe the problem can be solved with fewer epochs and a smaller learning rate.
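
Something along these lines, with purely illustrative values (the exact compile/fit setup depends on your notebook):

import tensorflow as tf

# Illustrative only: Adam below the 1e-3 Keras default, and fewer epochs.
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)
epochs = 30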

I'll try to replicate it anyway; it could also be a problem with the covariance parameters in the TensorFlow implementation.

VolodyaCO commented 3 years ago

A mate of mine and I have been rewriting this library in PyTorch. We found a couple of strange things with the covariance matrix, but we make sure that it stays positive semidefinite. We have not yet tested the periodic Mahalanobis case, but we have already had some interesting discussions about how the shapes of the covariance matrices are affected by training.
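
For instance, one standard way to guarantee positive semidefiniteness (a sketch of the general idea, not a quote from our library) is to parameterise a Cholesky factor with a strictly positive diagonal:

import torch
import torch.nn.functional as F

raw = torch.nn.Parameter(torch.randn(2, 2))  # unconstrained trainable matrix

def covariance(raw, eps=1e-6):
    # Lower-triangular factor with a softplus-positive diagonal,
    # so cov = L @ L.T + eps*I is positive definite by construction.
    L = torch.tril(raw, diagonal=-1) + torch.diag(F.softplus(torch.diagonal(raw)))
    return L @ L.T + eps * torch.eye(raw.shape[0])

print(torch.linalg.eigvalsh(covariance(raw)))  # all eigenvalues > 0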

We will continue to develop the library; if you want to take a look at it, give me your GitLab account and I will invite you!

juselara1 commented 3 years ago

Sure, my username on GitLab is juselara.

I'd really appreciate the chance to look at the improvements to the covariance matrices and port them to the TensorFlow version.

Thanks