betterenvi / gSpan

Python implementation of frequent subgraph mining algorithm gSpan. Directed graphs are supported.
https://pypi.org/project/gspan-mining/
MIT License
198 stars, 90 forks

added upper bound for number of generated features #19

Open blumewas opened 3 years ago

blumewas commented 3 years ago

Hi, first of all, thank you for the great work! As I mentioned in an earlier issue, I used your package for the practical part of my Bachelor's thesis.

I have made some modifications to the program and want to share them so that other programmers can hopefully benefit from my work.

Modification

As described in the gSpan paper, the experiments use an upper bound on the number of graphs generated. For some of the datasets I used, the mining process seemed endless, especially with low support. So I have added the argument -mm <number> or --max_mining <number>, which stops the mining process once the count of generated features reaches the <number> passed with the argument.
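
To illustrate the idea, here is a minimal sketch of how such an upper bound could be wired up. It is not the code from this pull request: only the flag names -mm and --max_mining come from the description above, while `build_parser`, `candidate_features`, and `mine` are hypothetical names, and the toy generator merely stands in for the real subgraph enumeration.

```python
import argparse


def build_parser():
    """Build a toy CLI parser exposing the -mm / --max_mining option."""
    parser = argparse.ArgumentParser(
        description="Toy miner with an upper bound on generated features"
    )
    parser.add_argument(
        "-mm", "--max_mining",
        type=int,
        default=None,
        help="stop once this many features have been generated "
             "(default: no limit)",
    )
    return parser


def candidate_features(total=1000):
    """Hypothetical stand-in for the miner: yields `total` toy features."""
    for i in range(total):
        yield f"feature-{i}"


def mine(max_mining=None):
    """Collect features, stopping early once max_mining is reached.

    With max_mining=None the generator is consumed completely.
    """
    features = []
    for feature in candidate_features():
        # Check the bound before recording another feature.
        if max_mining is not None and len(features) >= max_mining:
            break
        features.append(feature)
    return features
```

For instance, `mine(build_parser().parse_args(["-mm", "5"]).max_mining)` returns only the first five toy features, whereas omitting the flag lets the toy generator run to completion. In a recursive miner the same check would presumably sit at the point where a frequent subgraph is reported, so that deeper recursion can be skipped once the limit is hit.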