Open aaroncraig10e opened 7 years ago
I've discovered that the problem is in the tokenizer, which does not strip punctuation. Working on a fix now.
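To illustrate the kind of mismatch described here, the following is a minimal sketch and not Elasticlunr's actual tokenizer. The sample sentence and the function names `naiveTokenize` and `trimmedTokenize` are hypothetical, since the documents from the original report are not shown. It assumes a tokenizer that splits on whitespace only, so a token such as `candlestick,` never equals the query term `candlestick`.

```javascript
// Hypothetical sample text; the documents from the original report are not available.
const doc = "She set the candlestick, unlit, beside the door.";

// Naive tokenizer (assumed behaviour): lowercases and splits on whitespace only,
// so punctuation stays attached to the neighbouring word.
function naiveTokenize(text) {
  return text.toLowerCase().split(/\s+/).filter(Boolean);
}

// Same split, then strips leading and trailing non-alphanumeric characters
// from each token. ASCII-only for brevity.
function trimmedTokenize(text) {
  return naiveTokenize(text)
    .map((t) => t.replace(/^[^a-z0-9]+|[^a-z0-9]+$/g, ""))
    .filter(Boolean);
}

const naive = naiveTokenize(doc);
const trimmed = trimmedTokenize(doc);

console.log(naive.includes("candlestick"));   // false: the token is "candlestick,"
console.log(trimmed.includes("candlestick")); // true
```

Trimming only the edges of each token leaves punctuation inside a word untouched, which matters for terms with internal apostrophes.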
@aaroncraig10e Did you land a fix for this?
I ended up making my own package, as there were some other issues that were difficult to fix without changing the interface.
Apropos of this: is there a way to get Elasticlunr to handle punctuation, i.e. not strip out all punctuation characters?
For example, I've built a Shakespeare search app at shearch.me, but it doesn't cope with queries such as call'd (which are common in texts of this kind).
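One possible approach, sketched here without relying on any Elasticlunr API, is a tokenizer that keeps apostrophes inside a word while dropping punctuation around it. The function name `tokenizeKeepingApostrophes` and the sample line are hypothetical; how such a function would be wired into the library's pipeline is not covered and would need checking against its documentation.

```javascript
// Matches runs of ASCII letters/digits, optionally joined by a straight (')
// or typographic (\u2019) apostrophe, so "call'd" survives as a single token.
function tokenizeKeepingApostrophes(text) {
  const pattern = /[a-z0-9]+(?:['\u2019][a-z0-9]+)*/g;
  return text.toLowerCase().match(pattern) || [];
}

// Hypothetical sample line, not taken from any indexed text.
const tokens = tokenizeKeepingApostrophes("'Tis call'd, by some, a \"candlestick\".");

console.log(tokens);
```

The same function would have to be applied to both the indexed text and the query, otherwise a query for `call'd` would be tokenized differently from the documents.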
Certain terms do not produce matches as expected.
For instance, given the following docs:
a search on the term
candlestick
does not produce a hit. I'm guessing this has to do with the stemmer, as in Elasticsearch I've had mixed results using the Porter stemmer.
We are using this library now for a new project, so I am happy to work on a fix to this and send a PR. Just wanted to post the issue here for anyone else having the same issue.