Closed cbanbury closed 7 years ago
Hi Carl!
Thanks for the feedback, you're right about these points.
I have tried commenting out the normalisation line:
this.data = this.normalize(data, scales);
but still see the same convergence. I'll have a play with different normalisation methods externally.
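For anyone else trying normalisation outside the package, here is a minimal sketch of per-column min-max scaling. `minMaxNormalize` is a hypothetical helper written for this comment, not part of kohonen, and it assumes the input is a non-empty array of equal-length numeric rows.

```javascript
// Hypothetical helper (not part of the kohonen package):
// min-max scale each column of `rows` to the range [0, 1].
function minMaxNormalize(rows) {
  const dims = rows[0].length;
  const mins = new Array(dims).fill(Infinity);
  const maxs = new Array(dims).fill(-Infinity);

  // First pass: find the minimum and maximum of every column.
  for (const row of rows) {
    row.forEach((v, j) => {
      if (v < mins[j]) mins[j] = v;
      if (v > maxs[j]) maxs[j] = v;
    });
  }

  // Second pass: rescale every value using its column's range.
  return rows.map((row) =>
    row.map((v, j) => {
      const range = maxs[j] - mins[j];
      // A constant column has range 0; map it to 0 instead of computing 0 / 0 (NaN).
      return range === 0 ? 0 : (v - mins[j]) / range;
    })
  );
}

const scaled = minMaxNormalize([[1, 10, 5], [3, 20, 5], [2, 30, 5]]);
```

The guard on constant columns matters here: a single NaN in the input is enough to turn every distance computed from that row into NaN.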
Regarding the timeout, I think you can set it manually for one or more tests. I've been trying to find some test data; how about using astronomical spectra:
http://cdsarc.u-strasbg.fr/viz-bin/Cat?III/92#sRM2.1
This paper did something similar to classify stellar types using SOM.
Thanks, let me know if you find something!
I'll have a look, I'm sure that will be an interesting test case :)
I was quite busy this past week, but I will have more time for this in the coming week!
Wow, 2799 dimensions in the stellar dataset!
Ha, yes it might be a bit overkill for a test, in theory it should still work though. Would be nice to see what the limits are for this kind of thing using JavaScript.
Vectorial operations seem to be the problem (combined with normalized values)... Even with a single iteration, all the data converge to the same neuron because the dist method returns NaN... I'm not sure yet.
I got it, it's a BIG mistake in the eigenvectors generation!!
Basically, I generate vectors of dimension N where N is the number of input data points, not the number of their dimensions... :ashamed
It was working because:
Basically, I could have randomly initialized my neurons' vectors, it would have been the same...
The convergence on a single neuron occurs as soon as the number of dimensions is bigger than the number of input data points, which makes the dist method return NaN.
I'm gonna add decent test coverage for that!
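To illustrate the failure mode described above, here is a minimal sketch. It is not the package's actual code: `dist` and `bestMatchingUnit` are hypothetical stand-ins, and the sketch assumes the distance loops over the input's dimensions and that the winning neuron is picked with a plain `<` comparison.

```javascript
// Hypothetical helpers, not the kohonen package's real implementation.
// Euclidean distance that iterates over the dimensions of the input vector `a`.
function dist(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    // b[i] is undefined when b is shorter than a, so the difference is NaN.
    const diff = a[i] - b[i];
    sum += diff * diff;
  }
  return Math.sqrt(sum);
}

// Index of the closest neuron (best matching unit) for one input.
function bestMatchingUnit(input, neurons) {
  let best = 0;
  let bestDist = Infinity;
  neurons.forEach((neuron, i) => {
    const d = dist(input, neuron);
    // Any comparison with NaN is false, so `best` never moves off index 0.
    if (d < bestDist) {
      best = i;
      bestDist = d;
    }
  });
  return best;
}

// 2 inputs with 3 dimensions each: more dimensions than inputs.
const inputs = [[0.1, 0.2, 0.3], [0.9, 0.8, 0.7]];

// Buggy init: neuron vectors sized by the number of inputs (2).
const buggyNeurons = [[0.1, 0.2], [0.9, 0.8]];
// Fixed init: neuron vectors sized by the number of dimensions (3).
const fixedNeurons = [[0.1, 0.2, 0.3], [0.9, 0.8, 0.7]];

const buggyHits = inputs.map((x) => bestMatchingUnit(x, buggyNeurons));
const fixedHits = inputs.map((x) => bestMatchingUnit(x, fixedNeurons));
```

With the buggy vectors every distance is NaN, so both inputs land on neuron 0; with correctly sized vectors they land on neurons 0 and 1. Under the same assumptions, when there are at least as many inputs as dimensions the extra components are simply never read, which would explain why the bug stayed hidden on low-dimensional data.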
Oops! At least it's a fairly easy fix. 😸
@cbanbury I finally opened an issue on the ml-pca repo: https://github.com/mljs/pca/issues/9 because I was not sure of the behavior of their eigenvectors... but it was actually my mistake.
After fixing this, I ran the stars example and the results are not that bad for a first attempt. I've begun a visualisation in a dedicated repo: https://github.com/seracio/kohonen-stars (beware, the vis is working but the SOM calculation is based on a not-yet-released version of kohonen - https://github.com/seracio/kohonen/tree/45-api-redesign)
Awesome stuff! I have a feeling that I've run into a similar issue with the ml-pca package, so perhaps their docs need more clarity.
The visualisation looks great, and nice to have as an example for using the package.
v0.7.0 is out; in the end it only fixes this bug. The API redesign will be for v1!
I've been playing with this a bit more and it works well for the canonical example of mapping colours. However, when I feed data with more variables (~40) into the SOM, all of the inputs tend to converge on a single neuron.
You seem to have had this issue before in #17; I'm wondering if it is again related to normalisation?
Should probably have: