HITS-AIN / PINK

Parallelized rotation and flipping INvariant Kohonen maps
GNU General Public License v3.0

Removed gaussian scaling in neighbour function #49

Closed tjgalvin closed 4 years ago

tjgalvin commented 4 years ago

When using a Gaussian neighborhood function, the current behavior changes the peak as a function of sigma, i.e. the area under the curve stays constant across different values of sigma while the peak scales as 1/(sigma * sqrt(2*pi)). In practice this can have odd / undesired effects when scaling the magnitudes of the weight-update process. This can be negated if the learning rate term supplied also includes the appropriate inverse factor, but that is easily overlooked.

I've edited the GaussianFunctor function to ensure the Gaussian neighborhood curve always sits on a 0 -> 1 range, regardless of sigma. Attached is a small figure showing the behavior. The red set of curves uses the existing implementation, and the blue set the updated one. I've used sigmas ranging from 0.1 to 2. Code to reproduce is below.

Any thoughts?

[Figure: Gaussian_Update - normalized Gaussian curves (red, current implementation) vs unit-peak curves (blue, updated implementation) for sigmas 0.1 to 2]

import numpy as np
import matplotlib.pyplot as plt

def gaussian(x, mu, sig, damp=1.):
    # Current implementation: normalized Gaussian, so the peak shrinks as sig grows
    return damp / (sig * np.sqrt(2.*np.pi)) * np.exp(-0.5*((x-mu)/sig)**2.)

def gaussian_nonorm(x, mu, sig, damp=1.):
    # Updated implementation: unit-peak Gaussian, the peak is always damp
    return damp * np.exp(-0.5*((x-mu)/sig)**2.)

x = np.linspace(0, 5, 100)

fig, ax = plt.subplots(1, 1)

for s in np.linspace(0.1, 2, 10):
    ls, = ax.plot(x, gaussian(x, 0, s, damp=1), 'r-')
    nls, = ax.plot(x, gaussian_nonorm(x, 0, s, damp=1), 'b--')

ax.legend([ls, nls], ['Current Curve', 'Updated Curve'], loc='upper right')
plt.show()
BerndDoser commented 4 years ago

Looks good to me. I would suggest adding a new DistributionFunctor to stay backward compatible.

Did you know that, using the Python interface, it is possible to define your own distribution function very easily? See:

https://github.com/HITS-AIN/PINK/blob/b4bf456b932fb12f80560fbbd6a9133388e78950/scripts/train.py#L18-L27
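For example, a unit-peak functor along the lines of the change proposed here might look something like this (a rough sketch only; the class name and callable signature are illustrative, not the exact interface in the linked train.py):

import numpy as np

class UnityGaussianFunctor:
    """Unit-peak Gaussian neighbourhood: the maximum is always `damp`,
    independent of sigma."""

    def __init__(self, sigma=1.0, damp=0.1):
        self.sigma = sigma
        self.damp = damp

    def __call__(self, distance):
        # No 1/(sigma*sqrt(2*pi)) normalisation, so the peak does not
        # shrink or grow as sigma changes
        return self.damp * np.exp(-0.5 * (distance / self.sigma) ** 2)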

tjgalvin commented 4 years ago

@BerndDoser I will give it a go and try to add a new DistributionFunctor - I'll do this as a fun exercise over the next few days.

I did see that, but I have been following the rough processing scheme I had from some earlier work. I am becoming more inclined to use this method, but I need to confirm it works with my HPC environment and conda setup.

tjgalvin commented 4 years ago

@BerndDoser I've added this as a separate distribution function called UnityGaussian. It appears to compile and run fine for me. Any thoughts?

BerndDoser commented 4 years ago

Thanks Tim! Looks good. Do you have some statistics on how the UnityGaussian distribution function improves the training procedure? Can we do something like what Rafael was doing in https://github.com/HITS-AIN/PINK/issues/33, calculating the AQE and TE?

tjgalvin commented 4 years ago

I've got nothing formal to say it is 'better' other than the figure included above and a silly gut feeling. What is essentially happening is that the learning rate (damping in the code) specified by the user is getting perturbed by the sigma term in a non-obvious way, which doesn't feel right.
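
To put some numbers on that perturbation (a quick sanity check using the same formula as the snippet above):

import numpy as np

damp = 0.1  # the learning rate the user asked for
for sig in (0.1, 0.5, 1.0, 2.0):
    peak = damp / (sig * np.sqrt(2. * np.pi))  # peak of the normalized Gaussian
    print(f"sigma = {sig}: effective peak update = {peak:.3f}")
# sigma = 0.1 gives ~0.399, four times the requested damping;
# sigma = 2.0 gives ~0.020, five times smaller.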

The easy and obvious thing to do would be to train two SOMs - one with Gaussian and one with UnityGaussian - to see the difference. With a sufficiently large training set it might make very little difference.

tjgalvin commented 4 years ago

Checking against the AQE and TE can be done as well.
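
For reference, a rough sketch of how I'd compute the AQE and TE for a rectangular map with plain NumPy (the (rows, cols, dim) weights layout and the standard definitions are my assumptions here, not necessarily how PINK stores things):

import numpy as np

def aqe_te(weights, data):
    # weights: (rows, cols, dim) SOM codebook; data: (n, dim) samples
    rows, cols, dim = weights.shape
    flat = weights.reshape(-1, dim)
    # Euclidean distance of every sample to every map unit
    d = np.linalg.norm(data[:, None, :] - flat[None, :, :], axis=2)
    order = np.argsort(d, axis=1)
    bmu, second = order[:, 0], order[:, 1]
    # AQE: mean distance between each sample and its best-matching unit
    aqe = d[np.arange(len(data)), bmu].mean()
    # TE: fraction of samples whose two best units are not adjacent on the grid
    r1, c1 = np.divmod(bmu, cols)
    r2, c2 = np.divmod(second, cols)
    te = np.mean(np.maximum(np.abs(r1 - r2), np.abs(c1 - c2)) > 1)
    return aqe, te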