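This isn't a major bottleneck, but a simple fix should speed up the threshold filtering in thresholds.py.
The current code calls stats.percentileofscore once per post, which rescans every score each time; computing the cutoff score once with stats.scoreatpercentile avoids that.
Here's the diff. Sorry for the inconvenience :)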
diff --git a/thresholds.py b/thresholds.py
index 739d869..1524e69 100644
--- a/thresholds.py
+++ b/thresholds.py
@@ -24,13 +24,8 @@ class Threshold(Enum):
"""Returns a list of ScoredPosts that meet this Threshold with the given Scorer"""
all_post_scores = [p.get_score(scorer) for p in posts]
- threshold_posts = [
- p
- for p in posts
- if stats.percentileofscore(all_post_scores, p.get_score(scorer))
- >= self.value
- ]
-
+ q = stats.scoreatpercentile(all_post_scores, per=self.value)
+ threshold_posts = [p for p, s in zip(posts, all_post_scores) if s >= q]
return threshold_posts
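For anyone who wants to check the idea without SciPy, here is a minimal standalone sketch of the "compute the cutoff once, then filter" approach. The helper names `score_at_percentile` and `posts_meeting_threshold` are hypothetical and not part of the project; the helper assumes the linear interpolation that `stats.scoreatpercentile` uses by default. Note that this may select a slightly different set of posts near the cutoff than the old `percentileofscore` version, since the two functions define percentiles differently.

```python
import math


def score_at_percentile(scores, per):
    """Linearly interpolated percentile of `scores`.

    Hypothetical stand-in for stats.scoreatpercentile with its default
    interpolation; not the project's actual code.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    ordered = sorted(scores)
    # Fractional index of the requested percentile in the sorted list.
    idx = per / 100 * (len(ordered) - 1)
    lo = math.floor(idx)
    hi = min(lo + 1, len(ordered) - 1)
    frac = idx - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


def posts_meeting_threshold(posts, scores, per):
    """Keep the posts whose score is at or above the percentile cutoff."""
    cutoff = score_at_percentile(scores, per)  # computed once, not per post
    return [p for p, s in zip(posts, scores) if s >= cutoff]


# Toy data: plain strings stand in for ScoredPost objects.
posts = ["a", "b", "c", "d", "e"]
scores = [10, 50, 30, 20, 40]
result = posts_meeting_threshold(posts, scores, 50)
print(result)
```

The sort makes this O(n log n) overall, versus one full pass over the scores per post in the original list comprehension.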