We're hitting an issue with pattern==3.6 on Python 3: when the model documents contain duplicates (or similar cases), the nsmallest call in vector_space_search fails. To reproduce:
from pattern.en import lexeme
from pattern.vector import Document, LEMMA, TFIDF, Model
responses = ['it is works great. ', 'bristles are soft and compact enough', 'the aftertaste isnt as bad as others. ', 'i dont know. it isnt something i think about.', 'bristles are soft and compact enough']
exclude = ['t', 'im']
docs = [Document(response, stemmer=LEMMA, name=str(i), exclude=exclude, stopwords=False) for i, response in enumerate(responses)]
m = Model(documents=docs, weight=TFIDF)
results = m.search(words=lexeme('bristle'), top=100)
Results in:
(If you're wondering why this works in Python 2, the explanation is in https://docs.python.org/2/library/stdtypes.html#comparisons)
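For anyone who wants to see the Python 3 behaviour in isolation, here is a minimal sketch that doesn't need pattern installed. It assumes the search ranks (score, document) tuples with heapq.nsmallest, which I haven't confirmed against pattern's source. The Doc class and the scores are hypothetical stand-ins. When two scores tie, tuple comparison falls through to the document objects, and Python 3 refuses to order objects that don't define `<`.

```python
import heapq


class Doc:
    """Hypothetical stand-in for a document object that defines no ordering."""

    def __init__(self, name):
        self.name = name


# Made-up scores: two distinct documents tie, as duplicate texts would.
pairs = [(0.5, Doc("1")), (0.5, Doc("4")), (0.9, Doc("0"))]


def ranks_without_error(items, top=100):
    """Return True if ranking the raw tuples succeeds, False on TypeError."""
    try:
        heapq.nsmallest(top, items)
        return True
    except TypeError:
        # Tied scores force a comparison of the Doc objects themselves.
        return False


naive_ok = ranks_without_error(pairs)

# Possible workaround (an assumption, not pattern's actual fix):
# compare on the score only, so the documents are never compared.
by_score = heapq.nsmallest(2, pairs, key=lambda pair: pair[0])
names = [doc.name for _, doc in by_score]
```

On Python 3, naive_ok comes out False, while the key= version ranks the tied documents in their original order. Adding a unique tiebreaker (for example the document index) as a middle tuple element would be another way to avoid comparing the documents.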