Open sho-87 opened 5 years ago
Your question is not really clear to me. All the nodes have their own weight vectors, whether or not they are bmus. That is why the component planes visualization works for every node.
If you want to visualize the values of the data points, you need to write your own plot. Just convert the bmus to xy values and then use some sort of scatter plots for the training data.
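The bmu-to-xy conversion suggested above could look roughly like this. It is a minimal sketch that assumes the flat bmu index numbers the nodes row by row (row-major) over the map grid; check this against your own SOM before relying on it. The helper name bmu_to_xy and the toy values are made up for illustration.

```python
import numpy as np

def bmu_to_xy(bmus, mapsize):
    """Convert flat BMU indices to (row, col) grid coordinates.

    Assumption: nodes are numbered row by row (row-major) over a
    grid of shape mapsize = (rows, cols).
    """
    rows, cols = mapsize
    bmus = np.asarray(bmus, dtype=int)
    row, col = np.unravel_index(bmus, (rows, cols))
    return np.column_stack([row, col])

# Toy example: a 10x10 map and four data points with known BMUs
mapsize = (10, 10)
bmus = np.array([0, 9, 10, 99])
xy = bmu_to_xy(bmus, mapsize)
```

The columns of xy can then be passed to any scatter plot (for example col as x and row as y), colored by the data value of interest.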
When there are nodes between bmu nodes, this means there is a clear cluster border and those nodes are in those borders. Therefore, reducing the som size to 3x3 is not a good idea.
I have the same error for the real value component heatmaps in the example notebook. Using my own data, I am not able to reshape. Our error comes from this example here:
It would be good if someone could unblock us so we can plot the real component heatmap as well.
@sevamoo taking a step back... generally the problem is that the reshaping you used in the notebook (as posted by @germayneng) doesn't work on all data, even though from the looks of it, it should be a technique that generalizes to any data.
What do you think might be causing the reshaping errors, if it isn't the unique bmu count problem I described in the original question?
To see selected components:
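# Visualize component maps for a selected subset of variables.
# som_chosen is the trained SOM and Labels is the list of training column names (both defined elsewhere).
import numpy as np
import pandas as pd
import matplotlib
from sompy.visualization.plot_tools import plot_hex_map

matplotlib.rcParams.update({'font.size': 8})
vars = ['Temp', 'Consumo', 'WorkingDays']
# Denormalize the codebook so the node weights are in the original units
Nodes = pd.DataFrame(som_chosen._normalizer.denormalize_by(
    som_chosen.data_raw, som_chosen.codebook.matrix), columns=Labels)
# Copy so that columns can be added later without a SettingWithCopyWarning
Nodes_with_selected_variables = Nodes[vars].copy()
# Reshape to (rows, cols, n_variables) and flip vertically for plotting
plot_hex_map(np.flip(Nodes_with_selected_variables.values.reshape(som_chosen.codebook.mapsize +
    [Nodes_with_selected_variables.values.shape[-1]]), axis=0),
    titles=Nodes_with_selected_variables.columns, shape=[1, 3], colormap=None)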
Since some nodes in my data have 0 BMU hits, I had to write different code:
# Using the normalized training data and the normalized node positions, find the nearest neighbours of
# each node and average their values to build empirical estimates for the exogenous variables.
from sklearn.neighbors import NearestNeighbors

Nodes_normalized = pd.DataFrame(som_chosen.codebook.matrix, columns=Labels)  # normalized node positions
Data_normalized = pd.DataFrame(som_chosen._data, columns=Labels)  # normalized training data
Knearmodel = NearestNeighbors(n_neighbors=5)  # 5 nearest neighbours, Minkowski distance with p=2 (Euclidean)
Knearmodel.fit(Data_normalized.values)
TotalMinutesDay_regression = []
for i in range(len(Nodes)):  # estimate an empirical TotalMinutesDay for each node
    distances, indices = Knearmodel.kneighbors([Nodes_normalized.loc[i].values])
    # indices are row positions in the fitted data, so select by position with iloc
    closest_vectors_to_node = Cargadata_relevant.iloc[indices[0]]
    TotalMinutesDay_regression.append(np.mean(closest_vectors_to_node.TotalMinutesDay))
Nodes_with_selected_variables["TotalMinutesDay"] = TotalMinutesDay_regression
# Plot the hex map with all the variables needed
plot_hex_map(np.flip(Nodes_with_selected_variables.values.reshape(som_chosen.codebook.mapsize +
    [Nodes_with_selected_variables.values.shape[-1]]), axis=0),
    titles=Nodes_with_selected_variables.columns, shape=[1, 4], colormap=None)
With k nearest neighbours I can assign each node a value for a variable that was not used to train the SOM.
NameError: name 'som_chosen' is not defined
Dear akol67,
The code I posted is meant to be read, not run as is; it doesn't work out of context. som_chosen is the SOM I trained for my example.
In my context, the variables Temp, Consumo, WorkingDays and TotalMinutesDay are exogenous, and the only way I can see to construct the exogenous component maps is to take the mean of the nearest neighbours of each neuron. Even if you set value=1 where the BMU hit count is zero, you would still have the problem that some neurons (nodes) are not the BMU of any data point, so you cannot use the sompy code that retrieves the values of the exogenous variables from the data mapped to each neuron.
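Since the code above depends on my trained SOM, here is a self-contained toy version of the same idea that can be run as is. It is only a sketch: it uses plain NumPy instead of sklearn, the function name knn_node_values is made up, and the codebook, data and target arrays are invented numbers standing in for the normalized node weights, the normalized training data and the exogenous variable.

```python
import numpy as np

def knn_node_values(codebook, data, target, k=5):
    """Estimate an exogenous value per node as the mean target of the
    k training samples closest to that node (Euclidean distance)."""
    codebook = np.asarray(codebook, dtype=float)
    data = np.asarray(data, dtype=float)
    target = np.asarray(target, dtype=float)
    # Pairwise distances, shape (n_nodes, n_samples)
    dists = np.linalg.norm(codebook[:, None, :] - data[None, :, :], axis=2)
    # Positions of the k nearest samples for every node
    nearest = np.argsort(dists, axis=1)[:, :k]
    return target[nearest].mean(axis=1)

# Toy data: two nodes, four samples, one exogenous variable
codebook = np.array([[0.0, 0.0], [10.0, 10.0]])
data = np.array([[0.0, 1.0], [1.0, 0.0], [9.0, 10.0], [10.0, 9.0]])
target = np.array([1.0, 3.0, 20.0, 40.0])
node_target = knn_node_values(codebook, data, target, k=2)
```

Every node gets a value this way, including nodes with 0 hits, so the result always has one entry per node and can be reshaped to the map size.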
Dear Ricardo,
It worked!
vars = ['Temp','Consumo','WorkingDays']
Nodes=pd.DataFrame(som_chosen._normalizer.denormalize_by(
    som_chosen.data_raw,som_chosen.codebook.matrix), columns=Labels)
Nodes_with_selected_variables = Nodes[vars]
Regards, Alexandre Kolisnyk
I'm following the AirFlights example using my own dataset but run into a problem when I try to plot the real values component plane
I get a reshape error when I reach the following code:
I checked the values in df['bmus']. The max bmu value is 99, but there are only 68 unique bmu values. So this means I have too little data for my mapsize: some map cells are never chosen as the bmu, and those cells are therefore not represented in df['bmus']. This is confirmed by plotting the hitsmap, which shows a number of cells with 0 hits.
I can fix this by reducing my map size, but it only works if I go down to 3x3, which is pretty pointless.
Is there any other way to get around this problem?
View2D allows me to plot the prototype planes with no problem using the same model, despite some cells not being chosen as the best matching unit. Is there any way to do this with the real value planes as well? Maybe fill in the missing bmu values somehow?
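One possible way to "fill in the missing bmu values" is to compute the per-node mean over all map cells, not just the cells that appear in df['bmus'], and leave NaN where a cell has no hits. The sketch below is not the notebook's code: the function name node_means and the toy arrays are made up, and it assumes bmu indices number the cells row by row.

```python
import numpy as np

def node_means(values, bmus, mapsize):
    """Mean of `values` per map cell, with NaN for cells that have no hits.

    Assumption: bmus holds flat cell indices in row-major order for a
    grid of shape mapsize = (rows, cols).
    """
    rows, cols = mapsize
    n_nodes = rows * cols
    values = np.asarray(values, dtype=float)
    bmus = np.asarray(bmus, dtype=int)
    # Sum of values and number of hits for every cell, including empty ones
    sums = np.bincount(bmus, weights=values, minlength=n_nodes)
    hits = np.bincount(bmus, minlength=n_nodes)
    means = np.full(n_nodes, np.nan)
    hit_mask = hits > 0
    means[hit_mask] = sums[hit_mask] / hits[hit_mask]
    # Always n_nodes entries, so the reshape cannot fail
    return means.reshape(rows, cols)

# Toy example: a 2x3 map where cells 1, 3 and 4 are never a BMU
mapsize = (2, 3)
bmus = np.array([0, 0, 2, 5])
values = np.array([1.0, 3.0, 10.0, 7.0])
plane = node_means(values, bmus, mapsize)
```

Because the result always has rows * cols entries, the reshape succeeds regardless of how many unique bmu values there are; the NaN cells can be masked in the heatmap or filled in afterwards.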