Data Mining Algorithms
Setup
Linux:
$ cd data-mining-algorithms
$ sudo ./setup.sh
This will set up pypy3.3, pip3, matplotlib, and numpy.
Apriori Algorithm
To run the algorithm:
$ pypy3.3 Apriori/main.py [filename] [support] [confidence] -d D
- filename: the path to the dataset [a delimited text file, e.g. .csv]
- support: the minimum support for the Apriori algorithm [float]
- confidence: the minimum confidence to mine association rules [float]
- -d D: the delimiter (default: ,) [str]
Example:
$ pypy3.3 Apriori/main.py Datasets/Apriori/apr.fpg.retail.comma.txt 0.5 0.5
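To make the support and confidence arguments concrete, here is a minimal sketch of how the two measures are usually defined, run on a made-up toy dataset. The helper names (support, confidence) and the data are illustrative assumptions and are not taken from Apriori/main.py.

```python
# Hypothetical helpers for illustration; not the code in Apriori/main.py.

def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent union consequent) / support(antecedent)."""
    both = support(transactions, set(antecedent) | set(consequent))
    return both / support(transactions, antecedent)


# Toy data (made up): each transaction is a set of items.
transactions = [
    frozenset({"bread", "milk"}),
    frozenset({"bread", "butter"}),
    frozenset({"bread", "milk", "butter"}),
    frozenset({"milk"}),
]

# {bread, milk} appears in 2 of the 4 transactions.
print(support(transactions, {"bread", "milk"}))
# Of the 3 transactions containing bread, 2 also contain milk.
print(confidence(transactions, {"bread"}, {"milk"}))
```

With a minimum support of 0.5 and a minimum confidence of 0.5, as in the example command, the itemset {bread, milk} and the rule bread -> milk would both pass under these definitions.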
K-Means
To run the algorithm:
$ python3 K-Means/main.py [filename] [k] [squared error] [type] -x X -y Y -d D -t
- filename: the path to the dataset [a delimited text file, e.g. .csv]
- k: the number of clusters to generate [int]
- squared error: the squared-error threshold used as the termination criterion [float]
- type: the data type of the values, either int or float [str]
- -x X: the column in the dataset for the x axis (default: 0) [int]
- -y Y: the column in the dataset for the y axis (default: 1) [int]
- -d D: the delimiter (default: ,) [str]
- -t: set if the dataset has column titles [flag]
Example:
$ python3 K-Means/main.py Datasets/K-Means/airports-titles.dat 6 0.0001 float -t
$ python3 K-Means/main.py Datasets/K-Means/km.comma.txt 4 0.0001 int -d ,
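As a rough illustration of how k and the squared error threshold interact, here is a minimal K-Means sketch in plain Python. It assumes the algorithm stops once the total squared error improves by no more than the threshold between iterations. That stopping rule, the function name kmeans, the tol and seed parameters, and the toy points are all assumptions for illustration and may differ from what K-Means/main.py does.

```python
import random


def kmeans(points, k, tol, seed=0):
    """Cluster 2-D points; stop when the total squared error improves by <= tol.

    Hypothetical sketch; the stopping rule is an assumption, not the repo's code.
    """
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # pick k distinct points as initial centroids
    prev_sse = float("inf")
    while True:
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        sse = 0.0
        for x, y in points:
            dists = [(x - cx) ** 2 + (y - cy) ** 2 for cx, cy in centroids]
            best = dists.index(min(dists))
            clusters[best].append((x, y))
            sse += dists[best]
        # Termination: the squared error no longer improves by more than tol.
        if prev_sse - sse <= tol:
            return centroids, clusters, sse
        prev_sse = sse
        # Update step: move each centroid to the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        centroids = [
            (sum(p[0] for p in c) / len(c), sum(p[1] for p in c) / len(c)) if c else old
            for c, old in zip(clusters, centroids)
        ]


# Toy data (made up): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters, sse = kmeans(points, 2, 0.0001)
print(sorted(centroids))
print(sse)
```

A smaller threshold lets the loop run until the clusters have essentially stopped changing, while a larger one trades accuracy for fewer iterations.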