Data Mining Algorithms

Setup

Linux:

$ cd data-mining-algorithms
$ sudo ./setup.sh

This will set up pypy3.3, pip3, matplotlib, and numpy.

Apriori Algorithm

To run the algorithm:

$ pypy3.3 Apriori/main.py [filename] [support] [confidence] -d D

filename: the path to the dataset [a delimited text file, e.g. .csv]
support: the minimum support for the apriori algorithm [float]
confidence: the minimum confidence to mine association rules [float]
-d D: the delimiter (default: ,) [str]
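
To make the support and confidence thresholds concrete, here is a minimal sketch of how the two measures are conventionally defined. It is not taken from Apriori/main.py; the function names and the toy transactions are assumptions made for illustration only.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(antecedent, consequent, transactions):
    """support(antecedent and consequent together) / support(antecedent)."""
    base = support(antecedent, transactions)
    if base == 0:
        return 0.0  # the rule never applies, so report zero confidence
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / base


# Hypothetical toy dataset: one transaction per row of the input file.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

bread_support = support({"bread"}, transactions)
rule_confidence = confidence({"bread"}, {"milk"}, transactions)
```

With a minimum support of 0.5 and a minimum confidence of 0.5, as in the example invocation, the rule bread -> milk would be kept in this toy dataset: {bread, milk} appears in half of the transactions, and two of the three transactions containing bread also contain milk.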

Example:

$ pypy3.3 Apriori/main.py Datasets/Apriori/apr.fpg.retail.comma.txt 0.5 0.5

K-Means

To run the algorithm:

$ python3 K-Means/main.py [filename] [k] [squared error] [type] -x X -y Y -d D -t

filename: the path to the dataset [a delimited text file, e.g. .csv]
k: the number of clusters you want to generate [int]
squared error: the squared-error threshold used as the termination criterion [float]
type: the data type of the values in the dataset (int or float) [str]
-x X: the column in the dataset for the x axis (default: 0) [int]
-y Y: the column in the dataset for the y axis (default: 1) [int]
-d D: the delimiter (default: ,) [str]
-t: set this if the first row of the dataset contains column titles (flag)
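
As a rough illustration of how k and the squared-error threshold interact, the following is a minimal sketch of k-means on two columns of data. It is not the code in K-Means/main.py. It assumes that the first k points are used as initial centroids and that iteration stops once the total squared error changes by less than the threshold; the actual script may seed and terminate differently. The names kmeans, tolerance, and max_iter are hypothetical.

```python
def kmeans(points, k, tolerance, max_iter=100):
    """Cluster 2-D points; return (centroids, clusters)."""
    # Assumption: seed the centroids with the first k points.
    centroids = [points[i] for i in range(k)]
    prev_sse = float("inf")
    clusters = [[] for _ in range(k)]

    for _ in range(max_iter):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        sse = 0.0
        for p in points:
            dists = [(p[0] - c[0]) ** 2 + (p[1] - c[1]) ** 2 for c in centroids]
            nearest = dists.index(min(dists))
            clusters[nearest].append(p)
            sse += dists[nearest]

        # Assumed termination rule: stop when the squared error stabilises.
        if abs(prev_sse - sse) < tolerance:
            break
        prev_sse = sse

        # Update step: move each centroid to the mean of its cluster,
        # keeping the old centroid if the cluster is empty.
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c))
            if c else centroids[i]
            for i, c in enumerate(clusters)
        ]

    return centroids, clusters


# Hypothetical data standing in for the X and Y columns of a dataset.
points = [(0, 0), (10, 10), (0, 1), (10, 11)]
centroids, clusters = kmeans(points, k=2, tolerance=0.0001)
```

A smaller threshold generally means more iterations before the loop stops, while a larger one ends the run earlier with a coarser clustering.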

Examples:

$ python3 K-Means/main.py Datasets/K-Means/airports-titles.dat 6 0.0001 float -t
$ python3 K-Means/main.py Datasets/K-Means/km.comma.txt 4 0.0001 int -d ,
