Open dhdaines opened 1 week ago
The documentation for lunr.js is incorrect. For instance, https://lunrjs.com/docs/lunr.Pipeline.html says:
"An instance of lunr.Index created with the lunr shortcut will contain a pipeline with a stop word filter and an English language stemmer"
Here is a minimal example to show the problem, which I think you'll agree is pretty serious:
from lunr import lunr

index = lunr(
    ref="id",
    fields=["title", "body"],
    documents=[
        {"id": "1", "title": "To be or not to be?", "body": "That is the question!"}
    ],
)
# Should print a match for document "1", but the result is empty!
print(index.search("What is the question?"))
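The trimmer and the stop word filter are not added to the search pipeline (only the stemmer is), so a query token such as "question?" keeps its punctuation and never matches the indexed term "question". This will definitely cause problems with recall whenever users add punctuation to their queries.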
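To make the mismatch concrete, here is a rough standalone simulation of the two pipelines. It does not use lunr.py's own code: `tokenize`, `trim`, `index_pipeline`, `search_pipeline` and the abbreviated `STOP_WORDS` set are hypothetical stand-ins, and stemming is left out because it does not change any of the words involved. The assumption, based on the lunr.js source linked in this issue, is that the index pipeline trims and filters tokens while the search pipeline passes them through untouched apart from stemming.

```python
import re

# Hypothetical, abbreviated stop word list, for illustration only.
STOP_WORDS = {"is", "the", "that", "what", "to", "be", "or", "not"}


def tokenize(text):
    # Lowercase and split on whitespace and hyphens.
    return [t for t in re.split(r"[\s\-]+", text.lower()) if t]


def trim(token):
    # Strip leading and trailing non-word characters, as a trimmer would.
    return re.sub(r"^\W+|\W+$", "", token)


def index_pipeline(text):
    # Assumed indexing behaviour: trim, then drop stop words (stemming omitted).
    trimmed = (trim(t) for t in tokenize(text))
    return [t for t in trimmed if t and t not in STOP_WORDS]


def search_pipeline(text):
    # Assumed default query behaviour: no trimming, no stop word filtering.
    return tokenize(text)


indexed = set(index_pipeline("That is the question!"))
query = search_pipeline("What is the question?")

# The untrimmed query token "question?" is not among the indexed terms.
matches = [t for t in query if t in indexed]

# Trimming the query tokens first recovers the match.
trimmed_matches = [trim(t) for t in query if trim(t) in indexed]

print(indexed, query, matches, trimmed_matches)
```

In this sketch the only indexed term from the body is "question", while the query still contains "question?", so nothing matches until the query tokens are trimmed as well.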
Unfortunately, this is a case of bug compatibility with lunr.js: https://github.com/olivernn/lunr.js/blob/aa5a878f62a6bba1e8e5b95714899e17e8150b38/lunr.js#L49
But it should be documented and there should be a documented way to work around it. This is pretty easy: