Python implementation of closed frequent subgraph mining algorithm cgSpan. Only undirected graphs are currently supported.
https://pypi.org/project/cgspan-mining/
MIT License

cgSpan

This repository contains the implementation for our paper. If you find this code useful in your research, please consider citing:

@misc{shaul2021cgspan,
  title={cgSpan: Closed Graph-Based Substructure Pattern Mining}, 
  author={Zevin Shaul and Sheikh Naaz},
  year={2021},
  eprint={2112.09573},
  archivePrefix={arXiv},
  primaryClass={cs.AI}
}

cgSpan is an algorithm for mining closed frequent subgraphs, that is, frequent subgraphs that have no proper supergraph with the same support. This implementation of cgSpan is built on an existing implementation of gSpan.

gSpan is an algorithm for mining frequent subgraphs.
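
To make the difference between frequent and closed frequent patterns concrete, here is a minimal sketch of the definition only. It is not the search procedure used by cgSpan or gSpan. It assumes a toy representation in which every graph is a set of edges between uniquely labeled vertices, so that subgraph containment reduces to set inclusion; real graph databases need subgraph isomorphism tests instead. The names `support` and `closed_frequent` are hypothetical and are not part of this package.

```python
from typing import Dict, FrozenSet, Iterable, List, Tuple

# Toy assumption: vertex labels are unique, so an edge is just a label pair
# and "pattern is a subgraph of graph" becomes "pattern is a subset of graph".
Edge = Tuple[str, str]
Graph = FrozenSet[Edge]


def support(pattern: Graph, database: List[Graph]) -> int:
    """Count the database graphs that contain every edge of the pattern."""
    return sum(1 for graph in database if pattern <= graph)


def closed_frequent(
    candidates: Iterable[Graph], database: List[Graph], min_support: int
) -> Dict[Graph, int]:
    """Keep frequent patterns with no proper super-pattern of equal support."""
    supports = {p: support(p, database) for p in candidates}
    frequent = {p: s for p, s in supports.items() if s >= min_support}
    return {
        p: s
        for p, s in frequent.items()
        if not any(p < q and s == t for q, t in frequent.items())
    }


database = [
    frozenset({("A", "B"), ("B", "C")}),
    frozenset({("A", "B"), ("B", "C"), ("C", "D")}),
    frozenset({("A", "B")}),
]
candidates = [
    frozenset({("A", "B")}),
    frozenset({("B", "C")}),
    frozenset({("A", "B"), ("B", "C")}),
]
result = closed_frequent(candidates, database, min_support=2)
```

In this toy database all three candidates are frequent at a minimum support of 2, but the single edge B-C is not closed, because the larger pattern A-B, B-C occurs in exactly the same two graphs. A frequent miner would report all three patterns; a closed miner would report only the other two.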

This program implements cgSpan in Python. The GitHub repository is https://github.com/NaazS03/cgSpan.

The gSpan implementation referenced by this program can be found on GitHub at https://github.com/betterenvi/gSpan.

Undirected Graphs

This program currently supports only undirected graphs.

How to install

This program supports Python 3.

Method 1

Install this project using pip:

pip install cgspan-mining

Method 2

First, clone the project:

git clone https://github.com/NaazS03/cgSpan.git
cd cgSpan

You can optionally install this project as a third-party library so that you can run it under any path.

python setup.py install

How to run

The command is:

python -m cgspan_mining [-s min_support] [-n num_graph] [-l min_num_vertices] [-u max_num_vertices] [-v True/False] [-p True/False] [-w True/False] [-h] database_file_name

Some examples

python -m cgspan_mining -s 5000 ./graphdata/graph.data
python -m cgspan_mining -s 5000 -p True ./graphdata/graph.data
python -m cgspan_mining -h
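
If you want to launch the miner from another Python script, one option is to assemble the same command line and pass it to `subprocess`. The sketch below is only an illustration: `build_cgspan_command` is a hypothetical helper, not part of this package, and it covers only the `-s` and `-p` flags that appear in the examples above.

```python
import sys
from typing import List, Optional


def build_cgspan_command(
    database_file: str,
    min_support: Optional[int] = None,
    p: Optional[bool] = None,
) -> List[str]:
    """Assemble an argument list equivalent to `python -m cgspan_mining ...`."""
    # Use the current interpreter so the module is looked up in the same
    # environment that is running this script.
    cmd = [sys.executable, "-m", "cgspan_mining"]
    if min_support is not None:
        cmd += ["-s", str(min_support)]
    if p is not None:
        # The usage line shows this flag taking a literal True or False.
        cmd += ["-p", str(p)]
    cmd.append(database_file)
    return cmd


cmd = build_cgspan_command("./graphdata/graph.data", min_support=5000, p=True)
```

With the package installed, the resulting list can be run with `subprocess.run(cmd, check=True)`. Passing the arguments as a list rather than a single string avoids shell quoting problems with file paths.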

Reference

cgSpan: Closed Graph-Based Substructure Pattern Mining, by Zevin Shaul and Sheikh Naaz. arXiv:2112.09573, 2021.

CloseGraph: Mining Closed Frequent Graph Patterns, by X. Yan and J. Han.

gSpan: Graph-Based Substructure Pattern Mining, by X. Yan and J. Han. Proc. of the 2002 Int. Conf. on Data Mining (ICDM'02).