itdxer / neupy

NeuPy is a TensorFlow-based Python library for prototyping and building neural networks
http://neupy.com
MIT License
741 stars, 160 forks

Relate data points to clusters after clustering with GNG #278

Closed Rbasarat closed 3 years ago

Rbasarat commented 3 years ago

Hi,

I am doing some research where I cluster data with the GNG. For this research I need to know which data points are in which clusters. Is it possible to check which data point is in which cluster after training the GNG?

itdxer commented 3 years ago

Hi @Rbasarat,

The current implementation doesn't store this information, but it can be calculated with a few lines of code. You can get the locations of the nodes and, for each sample, find the closest neuron by L2 (Euclidean) distance.

from neupy import algorithms
import numpy as np
from scipy.spatial import distance_matrix

gng = algorithms.GrowingNeuralGas(...)
gng.train(X_samples)

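# The main logic
# Stack the node weights into one 2-D array (assumes each node.weight has shape (1, n_features))
gng_nodes = np.concatenate([node.weight for node in gng.graph.nodes])
# Pairwise L2 distances, then the index of the nearest node for every sample
distances = distance_matrix(X_samples, gng_nodes, p=2)
closest_node_index_per_sample = np.argmin(distances, axis=1)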

(I haven't checked the code, so I'm not 100% sure that it won't fail, but I hope the idea is clear.)
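
For anyone who wants to try the idea without a trained network, here is a minimal sketch on made-up data. It uses only NumPy and SciPy. `node_weights` and `edges` are hypothetical stand-ins for what a trained GNG would provide, and how to read the edges out of a neupy graph is not shown here and would need checking against the library. Treating each connected component of the node graph as one cluster is also an assumption; if a cluster means a single node for your research, step 1 alone is enough.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components
from scipy.spatial import distance_matrix

# Stand-in for trained node weights, shape (n_nodes, n_features).
# With a real network these would be collected from gng.graph.nodes.
node_weights = np.array([
    [0.0, 0.0],
    [1.0, 0.0],
    [10.0, 10.0],
    [11.0, 10.0],
])

# Hypothetical edge list given as pairs of node indices (not read from neupy).
edges = np.array([[0, 1], [2, 3]])

X_samples = np.array([
    [0.1, -0.2],
    [0.9, 0.1],
    [10.2, 9.9],
    [11.5, 10.3],
    [4.0, 4.0],
])

# Step 1: nearest node per sample using Euclidean (L2) distance.
distances = distance_matrix(X_samples, node_weights, p=2)
closest_node = np.argmin(distances, axis=1)

# Step 2: label each node with the connected component it belongs to.
n_nodes = len(node_weights)
adjacency = csr_matrix(
    (np.ones(len(edges)), (edges[:, 0], edges[:, 1])),
    shape=(n_nodes, n_nodes),
)
n_clusters, node_cluster = connected_components(adjacency, directed=False)

# Step 3: each sample inherits the component label of its nearest node.
sample_cluster = node_cluster[closest_node]

print(closest_node)
print(n_clusters, sample_cluster)
```

Note that every sample gets assigned to some node, even one that sits far from all of them (like the last sample here), so you may want to inspect `distances.min(axis=1)` if outliers matter.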