chuanconggao / PrefixSpan-py

The shortest yet efficient Python implementation of the sequential pattern mining algorithm PrefixSpan, closed sequential pattern mining algorithm BIDE, and generator sequential pattern mining algorithm FEAT.
https://git.io/prefixspan
MIT License
414 stars, 92 forks

maximum recursion depth exceeded while calling a Python object #29

Open yuvalshachaf opened 4 years ago

yuvalshachaf commented 4 years ago

When running topk with k >= 1000, I get this error. Of course I can raise the limit with sys.setrecursionlimit, but only up to a point.

Also, topk uses a heap, which slows things down dramatically. Using a list, as frequent does, is 5x faster.

chuanconggao commented 4 years ago

A heap is used for calculating the top-k patterns. If both your k and your data are large, performance suffers.

Feel free to open a pull request showing how to improve the performance using a list. Please also add some benchmarks to help understand the optimizations.
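
For readers unfamiliar with the trade-off under discussion, here is a minimal, generic sketch of the two strategies. It is not the library's actual code: `topk_heap`, `topk_list`, and the `(support, pattern)` tuple layout are hypothetical names chosen for illustration. The heap variant keeps at most k items and pays a logarithmic cost per update. The list variant collects everything and sorts once at the end.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical record layout for illustration: (support, pattern).
Scored = Tuple[int, Tuple[int, ...]]


def topk_heap(candidates: Iterable[Scored], k: int) -> List[Scored]:
    """Keep the k largest items in a bounded min-heap."""
    heap: List[Scored] = []
    for item in candidates:
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            # heap[0] is the smallest kept item; replace it with the better one.
            heapq.heappushpop(heap, item)
    return sorted(heap, reverse=True)


def topk_list(candidates: Iterable[Scored], k: int) -> List[Scored]:
    """Collect everything in a list, sort once, and slice."""
    return sorted(candidates, reverse=True)[:k]


sample = [(3, (1,)), (5, (1, 2)), (2, (2,)), (4, (3,)), (1, (4,))]
print(topk_heap(sample, 3))
print(topk_list(sample, 3))
```

Both functions return the same items. Once the heap is full, its smallest element also acts as a running threshold that a mining algorithm could use to skip weak candidates early. Whether that saving outweighs the per-item heap overhead depends on k and the data, which is why benchmarks are useful here.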

yuvalshachaf commented 4 years ago

Many thanks. What about the recursion depth? I cannot run topk for k larger than 1500 or so.

chuanconggao commented 4 years ago

This can be explained in the README. An additional optional parameter could also be provided in both the CLI and the library for the user to tweak.
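
As a possible workaround on the caller's side, the sketch below raises the interpreter's recursion limit and runs the deep call in a thread with a larger stack, using only the standard library. The helper name `run_with_deep_recursion` and its default values are assumptions made for illustration. They are not part of this library, and suitable limits depend on the platform and the data.

```python
import sys
import threading


def run_with_deep_recursion(func, *args, recursion_limit=20_000, stack_mb=64, **kwargs):
    """Hypothetical helper: call func(*args, **kwargs) with a raised
    recursion limit inside a thread that has a larger stack."""
    outcome = {}

    def target():
        try:
            outcome["value"] = func(*args, **kwargs)
        except BaseException as exc:  # re-raised in the calling thread below
            outcome["error"] = exc

    old_limit = sys.getrecursionlimit()
    # threading.stack_size returns the previous setting (0 means platform default)
    # and applies the new size to threads created afterwards.
    old_stack = threading.stack_size(stack_mb * 1024 * 1024)
    sys.setrecursionlimit(recursion_limit)
    try:
        worker = threading.Thread(target=target)
        worker.start()
        worker.join()
    finally:
        sys.setrecursionlimit(old_limit)
        threading.stack_size(old_stack)

    if "error" in outcome:
        raise outcome["error"]
    return outcome["value"]


def depth(n):
    # Toy recursive function standing in for a deeply recursive search.
    return 0 if n == 0 else 1 + depth(n - 1)


print(run_with_deep_recursion(depth, 5000))
```

Assuming the usage shown in the README, where a `PrefixSpan` object exposes `topk`, the call would look something like `run_with_deep_recursion(ps.topk, 2000)`. This only moves the ceiling higher. It does not remove the underlying recursion.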