K-Means Clustering Algorithm Based on Improved Cuckoo Search Algorithm and Its Application

:page_with_curl: Abstract (original text)

Because the K-Means algorithm is easy to fall into the local optimum and the Cuckoo search (CS) algorithm is affected by the step size, this paper proposes a K-Means clustering algorithm based on improved cuckoo search (ICSKmeans). The algorithm is compared with the original Kmeans, the Kmeans algorithm based on particle swarm optimization (PSO-Kmeans) and the K-Means algorithm based on the cuckoo search (CS-Kmeans). The experimental results show that the proposed algorithm can obtain better clustering effect, faster convergence rate and better accuracy rate through the experimental test on the UCI standard data set. The algorithm is also applied to the clustering of the characteristic parameters of the heart sound MFCC. The results show that a better clustering center can be obtained, the algorithm converges fast.

:bulb: What is the method?

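CS is incorporated into the K-means clustering process.
In CS, the step size A and the probability of discovery Pa have a significant effect on accuracy: the longer the step size, the lower the accuracy, and the shorter the step size, the lower the convergence rate. The step size is therefore tied to the iteration count so that it is long in the early iterations and short in the later ones.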
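
The idea of a step size that starts long and ends short can be sketched as follows. This is a minimal illustration, not the formula from the paper: the exponential schedule, the bounds `alpha_max` and `alpha_min`, and the helper names `step_size`, `levy_step`, and `propose` are all assumptions made for this note. The Lévy step uses Mantegna's algorithm, which is a common choice in cuckoo search implementations.

```python
import math
import random


def step_size(iteration, max_iterations, alpha_max=1.0, alpha_min=0.01):
    """Shrink the step exponentially from alpha_max to alpha_min.

    Assumed schedule for illustration only; not the paper's formula.
    """
    frac = iteration / max_iterations
    return alpha_max * (alpha_min / alpha_max) ** frac


def levy_step(beta=1.5, rng=random):
    """Draw one heavy-tailed step length with Mantegna's algorithm."""
    num = math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
    den = math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2)
    sigma_u = (num / den) ** (1 / beta)
    u = rng.gauss(0, sigma_u)
    v = rng.gauss(0, 1)
    return u / abs(v) ** (1 / beta)


def propose(nest, best, iteration, max_iterations, rng=random):
    """Move each coordinate of a nest (flattened centroids) by a scaled Levy flight.

    The move is scaled by the distance to the best nest, so the best nest
    itself is left unchanged by this proposal.
    """
    alpha = step_size(iteration, max_iterations)
    return [x + alpha * levy_step(rng=rng) * (x - b) for x, b in zip(nest, best)]
```

With these defaults, `step_size(0, 100)` is `1.0` and `step_size(100, 100)` is about `0.01`, so early proposals explore widely and late proposals only fine-tune the centroids.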

:chart_with_upwards_trend: What were the experiments and their results?

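Four representative UCI data sets (Iris, Wine, Seeds, Haberman) were used, and accuracy was verified using the sum of squared errors (SSE).
In addition, the algorithm was applied to 255 heart sound samples, each a 13-dimensional vector.
On all data sets the proposed algorithm performed slightly better. On the heart sound data the improvement was clear.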
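
For reference, the sum of squared errors used as the evaluation criterion above can be computed as in the sketch below. The function name `sse` and the toy points are made up for illustration and are not taken from the paper's data.

```python
def sse(points, centroids):
    """Sum, over all points, of the squared Euclidean distance to the nearest centroid."""
    total = 0.0
    for p in points:
        total += min(sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids)
    return total


# Toy data: two well-separated pairs of points.
points = [(0, 0), (0, 2), (10, 0), (10, 2)]
good = [(0, 1), (10, 1)]   # centroids at the cluster means
worse = [(0, 0), (10, 0)]  # centroids shifted off the means
print(sse(points, good), sse(points, worse))
```

A lower SSE means tighter clusters, which is why it can serve directly as the fitness value that a search algorithm tries to minimize.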

:open_file_folder: What are the future research directions and areas for improvement?

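This is not a future research direction as such, but it occurred to me that the RFS I developed could be applied to clustering in a similar way.
I suspect the results would have been more dramatic if more complex, higher-dimensional data sets had been used.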

:thumbsup: What is the novelty, and what did I learn from the paper?

The step size is set by an expression that combines the step size with the iteration count, which achieves both convergence speed and accuracy.
I was able to see how a metaheuristic is applied to clustering.
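
One common way to combine a metaheuristic with clustering is to treat each candidate solution as a set of centroids and refine it with an ordinary K-means update. Whether the paper does exactly this is an assumption of this note. The sketch below shows a single such update (one Lloyd iteration); the name `kmeans_update` is hypothetical.

```python
def kmeans_update(points, centroids):
    """One Lloyd iteration: assign each point to its nearest centroid,
    then replace each centroid with the mean of its assigned points."""
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in points:
        nearest = min(
            range(len(centroids)),
            key=lambda i: sum((a - b) ** 2 for a, b in zip(p, centroids[i])),
        )
        counts[nearest] += 1
        for d in range(dim):
            sums[nearest][d] += p[d]
    # An empty cluster keeps its old centroid instead of dividing by zero.
    return [
        [s / counts[j] for s in sums[j]] if counts[j] else list(centroids[j])
        for j in range(len(centroids))
    ]
```

In a hybrid scheme of this kind, the search step supplies diverse starting centroids and the K-means update pulls each candidate toward a nearby local optimum.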

koptimizer / my_PaperLog