EASE (Enhanced AI Scoring Engine) is a library that allows for machine learning based classification of textual content. This is useful for tasks such as scoring student essays.
GNU Affero General Public License v3.0
216 stars, 96 forks
Use a set instead of a list for good ngram lookup #59
When finding grammar errors, EASE checks whether each ngram in the submission is in a list called `good_pos_ngrams`. A membership test on a list is O(n) in the length of the list, and it runs once for every ngram in the submission.
Changing the `list` to a `set` makes the `in` operation O(1) in the average case, which results in a pretty dramatic speedup. When I profiled the algorithm on my laptop, I saw an improvement of about 71%.
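For reviewers who want to see the effect in isolation, here is a minimal sketch of the list versus set lookup. It is not the EASE code: the ngram strings, the collection sizes, and the helper `count_bad_ngrams` are all made up for illustration, and the timings will differ from the profile numbers above.

```python
import timeit

# Hypothetical stand-in for good_pos_ngrams (the real one holds POS-tag ngrams).
good_pos_ngrams_list = ["tag%d tag%d" % (i, i + 1) for i in range(5000)]
good_pos_ngrams_set = set(good_pos_ngrams_list)

# Hypothetical submission: 1000 ngrams, half of which are in the good collection.
submission_ngrams = ["tag%d tag%d" % (i, i + 1) for i in range(0, 10000, 10)]


def count_bad_ngrams(ngrams, good_ngrams):
    """Count ngrams that are missing from the collection of good ngrams."""
    return sum(1 for ngram in ngrams if ngram not in good_ngrams)


# Both containers give the same answer; only the lookup cost differs.
bad_with_list = count_bad_ngrams(submission_ngrams, good_pos_ngrams_list)
bad_with_set = count_bad_ngrams(submission_ngrams, good_pos_ngrams_set)
assert bad_with_list == bad_with_set

# Time each variant: the list scans linearly, the set hashes each ngram.
list_seconds = timeit.timeit(
    lambda: count_bad_ngrams(submission_ngrams, good_pos_ngrams_list), number=3
)
set_seconds = timeit.timeit(
    lambda: count_bad_ngrams(submission_ngrams, good_pos_ngrams_set), number=3
)
print("list: %.4fs  set: %.4fs" % (list_seconds, set_seconds))
```

The set is built once up front, so the one-time O(n) conversion cost is paid only once rather than on every lookup.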
@stephensanchez Please review. I'd like to get this in and re-run the perf test on dev.