Closed jvlmdr closed 4 years ago
Sorry about the large number of commits. When the pull request was merged, the commits were flattened. I only pulled the collapsed commit after doing the work. If you like, I can try to rebase it?
Opening new pull request with rebased branch
Very cool, Jack! You are driving this project :) I guess we should think about making you a maintainer, since the time I can spend professionally on this project has become quite limited. Interested?
Thanks! Yep, I think I would be able to do that. Let's talk via email.
I noticed that iteratively selecting rows from the dataframe was a serious bottleneck.
It looks like someone was already investigating this. I have removed the use of the cached analysis and the lines which computed timings.
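To illustrate the kind of bottleneck described here, the following is a minimal sketch and not the code from this pull request. The `events` frame, its column names, and both helper functions are hypothetical. It contrasts selecting rows one at a time with a single vectorized call, assuming the goal is to count events by type.

```python
import pandas as pd

# Hypothetical event log; the real project's columns may differ.
events = pd.DataFrame({
    "frame": [0, 0, 1, 1, 1, 2],
    "type": ["MATCH", "MISS", "MATCH", "MATCH", "FP", "MISS"],
})


def counts_row_by_row(df):
    """Slow pattern: select one row at a time with .iloc."""
    counts = {}
    for i in range(len(df)):
        row = df.iloc[i]  # builds a new Series on every iteration
        counts[row["type"]] = counts.get(row["type"], 0) + 1
    return counts


def counts_vectorized(df):
    """Faster pattern: let pandas count all rows in one call."""
    return df["type"].value_counts().to_dict()


# Both approaches should agree on the result.
assert counts_row_by_row(events) == counts_vectorized(events)
```

On a frame this small the difference is negligible. The per-row cost of `.iloc` tends to dominate only as the number of rows grows, which is why a benchmark on realistic data is the better guide.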
I isolated the code for extracting counts and added a benchmark (and a dependency on pytest-benchmark).
Before:
After (time in ms not s):