Rapidly Converging Gibbs sampler
MIT License

RACOG

A Python implementation of the Rapidly Converging Gibbs sampler (RACOG) [1, 2] for data oversampling, with CAIM [3] and MDLP [4] discretization methods.

Reference

[1] B. Das, N. C. Krishnan and D. J. Cook, "RACOG and wRACOG: Two Probabilistic Oversampling Techniques," in IEEE Transactions on Knowledge and Data Engineering, vol. 27, no. 1, pp. 222-234, Jan. 1 2015. doi: 10.1109/TKDE.2014.2324567. http://ieeexplore.ieee.org/document/6816044/

[2] https://github.com/barnandas/DataSamplingTools

[3] https://github.com/airysen/caimcaim

[4] https://github.com/airysen/mdlp

Installation

Requirements:

Example

>>> from racog import RACOG
>>> from sklearn.datasets import make_classification
>>> # Imbalanced two-class toy data: about 5% of the samples in class 0
>>> X, y = make_classification(n_samples=2000, n_features=7, n_redundant=2,
...                            n_informative=4, weights=[0.05, 0.95], n_classes=2)

>>> racog = RACOG(discretization='caim', categorical_features='auto',
...               warmup_offset=100, lag0=20, n_iter='auto',
...               continous_distribution='normal', random_state=None,
...               alpha=0.6, L=0.5, threshold=10, eps=10E-5, verbose=2, n_jobs=1)

>>> # Oversample the minority class; returns the resampled features and labels
>>> X_res, y_res = racog.fit_sample(X, y)
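
To give a feel for what a Gibbs-based oversampler does with settings such as `warmup_offset` and `lag0`, here is a minimal, self-contained sketch. It is not this package's code. It assumes that `warmup_offset` acts as a burn-in length and `lag0` as the gap between kept samples, and it models the attributes as a simple fixed chain instead of the Chow-Liu dependence tree described in [1]. The names `fit_chain` and `gibbs_oversample` are hypothetical.

```python
import random
from collections import Counter


def fit_chain(rows, n_values, smooth=1.0):
    """Estimate P(x_0) and P(x_{i+1} | x_i) with Laplace smoothing.

    Assumption: attributes form a fixed chain x_0 -> x_1 -> ... -> x_{d-1}.
    """
    d = len(rows[0])
    first = Counter(r[0] for r in rows)
    total = len(rows) + smooth * n_values
    p0 = [(first[v] + smooth) / total for v in range(n_values)]
    trans = []  # trans[i][a][b] approximates P(x_{i+1} = b | x_i = a)
    for i in range(d - 1):
        pair = Counter((r[i], r[i + 1]) for r in rows)
        left = Counter(r[i] for r in rows)
        trans.append([
            [(pair[(a, b)] + smooth) / (left[a] + smooth * n_values)
             for b in range(n_values)]
            for a in range(n_values)
        ])
    return p0, trans


def gibbs_oversample(rows, n_new, n_values, burn_in=100, lag=20, seed=0):
    """Draw n_new synthetic rows from discretized minority-class rows."""
    rng = random.Random(seed)
    p0, trans = fit_chain(rows, n_values)
    d = len(rows[0])
    state = list(rng.choice(rows))  # start the chain at a real minority row
    out = []
    sweep = 0
    while len(out) < n_new:
        # One Gibbs sweep: resample each attribute given all the others.
        for i in range(d):
            weights = []
            for v in range(n_values):
                # Factor involving the previous attribute (or the prior).
                w = p0[v] if i == 0 else trans[i - 1][state[i - 1]][v]
                # Factor involving the next attribute, if there is one.
                if i + 1 < d:
                    w *= trans[i][v][state[i + 1]]
                weights.append(w)
            state[i] = rng.choices(range(n_values), weights=weights)[0]
        sweep += 1
        # Discard the burn-in sweeps, then keep every lag-th state.
        if sweep > burn_in and (sweep - burn_in) % lag == 0:
            out.append(tuple(state))
    return out


# Toy, already discretized minority-class rows with binary attributes.
minority = [(0, 1, 1), (0, 1, 0), (1, 1, 1), (0, 0, 1)]
synthetic = gibbs_oversample(minority, n_new=5, n_values=2)
print(synthetic)
```

Keeping only every `lag`-th state after the burn-in reduces the correlation between consecutive synthetic rows, at the cost of more sweeps per generated sample.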