JustGlowing / minisom

MiniSom is a minimalistic implementation of the Self Organizing Maps (MIT License)

SOM Hyperparameter Tuning #101

Closed aktaseren closed 3 years ago

aktaseren commented 3 years ago

I have the SOM model below, applied to an intrusion detection case (unsupervised) whose dataset is quite big. I selected the parameters suggested in a paper. However, I need to tune the model and there is no documentation on how to tune MiniSom. Can you please suggest how to tune it?

"""Training the SOM"""
from minisom import MiniSom
som31 = MiniSom(x=31, y=31, input_len=122, sigma=1.0, learning_rate=0.5,
                neighborhood_function='bubble', topology='rectangular',
                activation_distance='euclidean')
som31.random_weights_init(X_train_p)
som31.train_random(data=X_train_p, num_iteration=500)
JustGlowing commented 3 years ago

hi @aktaseren, you have to try different combinations of parameters and pick the one that yields the best score.

Here's some pseudocode:

from minisom import MiniSom

params = [{'x': 3, 'y': 3, 'sigma': 1.0}, {'x': 2, 'y': 2, 'sigma': 0.9}]
errors = []
for p in params:
    som = MiniSom(x=p['x'], y=p['y'], input_len=data.shape[1], sigma=p['sigma'])
    som.train_random(data, num_iteration=500)  # `data` is your training array
    # evaluate your error and save it
    errors.append(som.quantization_error(data))

# pick the combination that minimizes the error
best_params = params[errors.index(min(errors))]

You need to treat this as a proper grid search and, if your problem allows it, you may want to use cross-validation to estimate your error.
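
If you go down the cross-validation route, a minimal sketch could look like the one below (the array name X, the fold count and the parameter values are just placeholders):

# Cross-validation sketch: for one parameter combination, train on each
# training fold and average the quantization error on the held-out fold.
# `X` stands for your training array; the values below are illustrative.
import numpy as np
from sklearn.model_selection import KFold
from minisom import MiniSom

def cv_quantization_error(X, x, y, sigma, lr, n_splits=3):
    errors = []
    for train_idx, val_idx in KFold(n_splits=n_splits, shuffle=True).split(X):
        som = MiniSom(x=x, y=y, input_len=X.shape[1],
                      sigma=sigma, learning_rate=lr,
                      neighborhood_function='bubble')
        som.random_weights_init(X[train_idx])
        som.train_random(X[train_idx], num_iteration=500)
        errors.append(som.quantization_error(X[val_idx]))
    return np.mean(errors)

# lower is better; compare this value across parameter combinations
score = cv_quantization_error(X, x=31, y=31, sigma=1.0, lr=0.5)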

aktaseren commented 3 years ago

Hi @JustGlowing, thanks a lot for the quick reply. I actually applied Bayesian optimization and, after tuning, I am getting nice quantization errors. However, the map output is completely dark, so I guess the SOM is overfitted. My main question is: what range should the SOM parameters be in? For example, sigma is generally taken as 1 and the learning rate is taken between 0 and 1. Do these parameters have any minimum or maximum thresholds?

JustGlowing commented 3 years ago

It's hard to give a range for the parameters because they depend on each other. For example, a small map (5x5) can work well with sigma in [1, 3], but with a bigger map you can increase sigma even further. My suggestion is to find a set of parameters that gives a result that visually makes sense and then vary them from there.
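
For the visual check, assuming a trained som and matplotlib, you can plot the distance map (U-matrix) and see whether the cluster structure looks reasonable:

# Plot the distance map (U-matrix) of a trained SOM as a quick visual
# sanity check; `som` is assumed to be an already trained MiniSom.
import matplotlib.pyplot as plt

plt.figure(figsize=(7, 7))
plt.pcolor(som.distance_map().T, cmap='bone_r')  # average distance to neighbouring neurons
plt.colorbar()
plt.show()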

aktaseren commented 3 years ago

Thanks a lot for this.

Last questions: my dataset has 200k rows and 29 columns. One paper suggests that the number of map nodes can be determined with the calculation below:

# Defining the 2-dimensional map size (side length of a square map)
from math import sqrt
sqrt(5 * sqrt(X_train.shape[0]))

Therefore, I set my map size to x=43 and y=43, with the parameters tuned for this size as follows:


# Set Hyperparameters
x = 43
y = 43
input_len = 29
sigma = 1
learning_rate = 0.5
neighborhood_function = 'bubble'
iterations = 500

What do you think of this method of calculating the number of nodes? I have some doubts about it, since only one paper suggests it without any concrete proof. Maybe this affects the quality of the SOM map output.

JustGlowing commented 3 years ago

That way of determining the size of the map is well known, but it's just a rule of thumb. The best size depends on how your data is distributed, and you can find it by trying different sizes and checking which one fits your data best.
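
For example, something along these lines compares a few candidate sizes around the rule-of-thumb value (the sizes and the array name X are illustrative; topographic_error is available in recent MiniSom versions):

# Compare a few candidate map sizes by quantization and topographic error.
from minisom import MiniSom

for size in (30, 43, 50):
    som = MiniSom(x=size, y=size, input_len=X.shape[1], sigma=1.0,
                  learning_rate=0.5, neighborhood_function='bubble',
                  random_seed=10)
    som.random_weights_init(X)
    som.train_random(X, num_iteration=500)
    print(size,
          'quantization error:', som.quantization_error(X),
          'topographic error:', som.topographic_error(X))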

aktaseren commented 3 years ago

Thanks very much for this

JustGlowing commented 3 years ago

You can now use this dashboard to explore the effects of the parameters on a sample dataset: https://share.streamlit.io/justglowing/minisom/dashboard/dashboard.py