Open yuvalshachaf opened 4 years ago
A heap is used to calculate top-k. If both your k and your data are large, performance suffers.
Feel free to open a pull request that improves performance using a list. Please also add some benchmarks to help understand the optimizations.
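For anyone picking this up, a benchmark could start from something like the sketch below. It is not the library's code: `topk_heap`, `topk_list`, `benchmark`, and the random `(score, item)` data are all hypothetical stand-ins, and the real heap and list paths may behave differently. It compares a bounded min-heap with "collect everything in a list, then sort and slice".

```python
import heapq
import random
import timeit


def topk_heap(items, k):
    """Keep the k largest (score, item) pairs in a bounded min-heap."""
    heap = []
    for entry in items:
        if len(heap) < k:
            heapq.heappush(heap, entry)
        elif entry > heap[0]:
            # Replace the current smallest entry with the larger one.
            heapq.heappushpop(heap, entry)
    return sorted(heap, reverse=True)


def topk_list(items, k):
    """Collect everything in a list, sort once, and slice off the top k."""
    return sorted(items, reverse=True)[:k]


def benchmark(n=20_000, k=1_000, repeat=3):
    """Time both variants on the same random data (hypothetical workload)."""
    rng = random.Random(0)
    data = [(rng.random(), i) for i in range(n)]
    t_heap = timeit.timeit(lambda: topk_heap(data, k), number=repeat)
    t_list = timeit.timeit(lambda: topk_list(data, k), number=repeat)
    print(f"heap: {t_heap:.4f}s  list: {t_list:.4f}s  (n={n}, k={k})")


if __name__ == "__main__":
    benchmark()
```

Timings depend on the data and on k, so a PR would ideally report numbers for several combinations of both.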
Many thanks. What about the recursion depth? I cannot run topk with k above 1500 or so.
This can be explained in the README. An optional parameter can also be added to both the CLI and the library so that users can tweak it.
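Until such a parameter exists, one possible workaround on the caller's side is to raise Python's recursion limit only around the deep call and restore it afterwards. This is a minimal sketch: `recursion_limit` and `depth` are hypothetical helpers, not part of this library, and `depth` only stands in for whatever recursive call hits the limit.

```python
import sys
from contextlib import contextmanager


@contextmanager
def recursion_limit(limit):
    """Temporarily raise the interpreter's recursion limit (never lowers it)."""
    old = sys.getrecursionlimit()
    sys.setrecursionlimit(max(old, limit))
    try:
        yield
    finally:
        # Restore the previous limit once the deep call has returned.
        sys.setrecursionlimit(old)


def depth(n):
    """Toy recursive function standing in for a deeply recursive search."""
    return 0 if n == 0 else 1 + depth(n - 1)


if __name__ == "__main__":
    with recursion_limit(10_000):
        print(depth(3_000))
```

A higher limit only moves the ceiling. Very deep recursion can still exhaust the interpreter's stack, so an iterative rewrite would be the more robust long-term fix.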
When running topk with k >= 1000, I get this error. Of course I can raise the limit with sys.setrecursionlimit, but only up to a point.
Also, topk uses a heap, which slows it down dramatically. Using a list, as frequent does, is 5x faster.