202 table method for clustering data

SUMMARY

Implements necessary changes for #202

Reviewers: @numere-org/maintainers

IMPLEMENTATION

Implementation:
- Added new calltree for table method kmeansof()
- tableMethod_kmeans in dataaccess: parses the parameters and calls the kmeans implementation in memory.cpp. The parameter n_init is already implemented here. It specifies how often kmeans is run with different start seeds; the result with the minimum inertia is returned.
- getKMeans() in memory.cpp: implementation of kmeans itself. The algorithm consists of 3 main steps. For initialization, two versions are available: random seeds or kmeans++. After initialization, the two steps "assign points to clusters" and "re-calculate cluster centers" are iterated until maxIterations or a stop criterion is reached.
- The following three helper functions were added:
- std::vector<int> getIndices(const std::vector<mu::value_type>& vec, mu::value_type value)
- double calculateL2Norm(const std::vector<mu::value_type>& vec1, const std::vector<mu::value_type>& vec2)
- std::vector<int> getIndices(const std::vector<mu::value_type>& vec, mu::value_type value)
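
For reviewers, here is a minimal standalone sketch of the scheme described above. It is not the code from memory.cpp. All names (Point, KMeansResult, squaredDistance, runKMeans, bestOfNRuns) are hypothetical, it uses double instead of mu::value_type, it only shows the random-seed initialization (not kmeans++), and the stop criterion "no point changed its cluster" is an assumption made for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <numeric>
#include <random>
#include <vector>

// Hypothetical types for this sketch; the PR code works on mu::value_type.
using Point = std::vector<double>;

struct KMeansResult
{
    std::vector<int> labels;    // cluster index of every point
    std::vector<Point> centers; // cluster centers
    double inertia = 0.0;       // sum of squared distances at the last assignment step
};

// Squared Euclidean distance between two points of equal dimension.
double squaredDistance(const Point& a, const Point& b)
{
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i)
        sum += (a[i] - b[i]) * (a[i] - b[i]);
    return sum;
}

// A single k-means run. Requires non-empty data and 1 <= k <= data.size().
KMeansResult runKMeans(const std::vector<Point>& data, std::size_t k,
                       std::size_t maxIterations, std::mt19937& rng)
{
    assert(!data.empty() && k >= 1 && k <= data.size());
    KMeansResult res;

    // Step 1: random initialization, k distinct data points become the start centers.
    std::vector<std::size_t> order(data.size());
    std::iota(order.begin(), order.end(), std::size_t{0});
    std::shuffle(order.begin(), order.end(), rng);
    for (std::size_t c = 0; c < k; ++c)
        res.centers.push_back(data[order[c]]);
    res.labels.assign(data.size(), -1);

    for (std::size_t iter = 0; iter < maxIterations; ++iter)
    {
        // Step 2: assign every point to its nearest center.
        bool changed = false;
        res.inertia = 0.0;
        for (std::size_t p = 0; p < data.size(); ++p)
        {
            int best = 0;
            double bestDist = std::numeric_limits<double>::infinity();
            for (std::size_t c = 0; c < k; ++c)
            {
                const double d = squaredDistance(data[p], res.centers[c]);
                if (d < bestDist)
                {
                    bestDist = d;
                    best = static_cast<int>(c);
                }
            }
            changed = changed || (res.labels[p] != best);
            res.labels[p] = best;
            res.inertia += bestDist;
        }
        if (!changed)
            break; // assumed stop criterion: the assignment is stable

        // Step 3: re-calculate each center as the mean of its assigned points.
        std::vector<Point> sums(k, Point(data[0].size(), 0.0));
        std::vector<std::size_t> counts(k, 0);
        for (std::size_t p = 0; p < data.size(); ++p)
        {
            ++counts[res.labels[p]];
            for (std::size_t i = 0; i < data[p].size(); ++i)
                sums[res.labels[p]][i] += data[p][i];
        }
        for (std::size_t c = 0; c < k; ++c)
        {
            if (counts[c] == 0)
                continue; // an empty cluster keeps its previous center
            for (double& v : sums[c])
                v /= static_cast<double>(counts[c]);
            res.centers[c] = sums[c];
        }
    }
    return res;
}

// n_init idea: repeat the run with different start seeds, keep the lowest inertia.
KMeansResult bestOfNRuns(const std::vector<Point>& data, std::size_t k,
                         std::size_t maxIterations, std::size_t nInit, unsigned seed)
{
    std::mt19937 rng(seed);
    KMeansResult best;
    best.inertia = std::numeric_limits<double>::infinity();
    for (std::size_t run = 0; run < nInit; ++run)
    {
        KMeansResult candidate = runKMeans(data, k, maxIterations, rng);
        if (candidate.inertia < best.inertia)
            best = candidate;
    }
    return best;
}
```

A call such as bestOfNRuns(data, 2, 100, 10, 42u) would run ten differently seeded attempts on data and return the one with the smallest inertia, which is why repeated calls tend to produce the same clusters even though a single run depends on its random start.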

The helper functions could be useful for other tasks, or they may already be implemented somewhere and I did not find them. Tell me if I should refactor something here.

Also, there are some "todo" comments; please give me some input on these points.

Implementation test: I created a table with 2 float value columns to test the algorithm. Since the result of kmeans depends on the random initialization step, the resulting clusters can vary. However, with n_init implemented, the resulting clusters are very likely the same.

DOCUMENTATION

[x] ChangesLog updated
[x] Code changes commented
Documentation articles:
- [ ] corresponding documentation articles updated
- [x] new documentation articles created
- [ ] not needed
Language files:
- [x] corresponding language files updated
- [ ] not needed

TESTS BY REVIEWERS

[ ] Added to the automatic SW tests
[x] Tested manually

