Open jmccrae opened 2 months ago
Repeatedly calling add_doc leads to very poor performance due to the check for duplicate document IDs.
This seems to be due to Corpus.doc_ids being regenerated every time it is called.
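
To illustrate the suspected cause, here is a minimal sketch. It is not the library's actual code: SlowCorpus and FastCorpus are hypothetical stand-ins, and the add_doc(doc_id, text) signature is assumed for the example. If doc_ids rebuilds its list on every access, each add_doc costs time proportional to the corpus size, so adding n documents is quadratic overall. Keeping the IDs in a dict (or set) makes the duplicate check constant time on average.

```python
class SlowCorpus:
    """Hypothetical corpus whose doc_ids list is rebuilt on every access."""

    def __init__(self):
        self._docs = []  # list of (doc_id, text) pairs

    @property
    def doc_ids(self):
        # Builds a fresh list each time: O(n) per access.
        return [doc_id for doc_id, _ in self._docs]

    def add_doc(self, doc_id, text):
        # O(n) rebuild plus O(n) scan on every call.
        if doc_id in self.doc_ids:
            raise ValueError(f"duplicate document ID: {doc_id}")
        self._docs.append((doc_id, text))


class FastCorpus:
    """Hypothetical alternative that keeps documents keyed by ID."""

    def __init__(self):
        self._docs = {}  # dicts preserve insertion order

    @property
    def doc_ids(self):
        return list(self._docs)

    def add_doc(self, doc_id, text):
        # Hash lookup: O(1) on average, independent of corpus size.
        if doc_id in self._docs:
            raise ValueError(f"duplicate document ID: {doc_id}")
        self._docs[doc_id] = text
```

Both classes expose the same doc_ids and reject duplicates in the same way; only the cost of the check differs.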