lucaong / minisearch

Tiny and powerful JavaScript full-text search engine for browser and Node
https://lucaong.github.io/minisearch/
MIT License

Exclude newline characters #14

Closed: samuelmeuli closed this issue 5 years ago

samuelmeuli commented 5 years ago

Shouldn't newline characters (\n and \r) also be part of the tokenizer regex?

Otherwise, words at the start of a new line won't be matched by a search since the preceding newline chars are added to the start of that term.
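To illustrate the problem, here is a minimal sketch. The separator patterns below are hypothetical stand-ins, not MiniSearch's actual tokenizer regex, and `tokenize` is a helper made up for this example. It assumes a tokenizer that splits on spaces and common punctuation only:

```javascript
// Hypothetical separator pattern, NOT MiniSearch's actual tokenizer regex:
// it splits on spaces and common punctuation but omits \n and \r.
const withoutNewlines = /[ .,;:!?]+/;
// The same hypothetical pattern with newline characters added.
const withNewlines = /[ .,;:!?\n\r]+/;

// Illustrative helper: split on the separator and drop empty strings.
const tokenize = (input, separator) => input.split(separator).filter(Boolean);

const text = 'First line.\nSecond line';

const before = tokenize(text, withoutNewlines);
const after = tokenize(text, withNewlines);

// The newline stays glued to the start of the term that follows it.
console.log(JSON.stringify(before)); // ["First","line","\nSecond","line"]
// With \n and \r treated as separators, the term is clean.
console.log(JSON.stringify(after));  // ["First","line","Second","line"]
```

Under these assumptions, a search for `Second` would miss the first tokenization, because the indexed term is `\nSecond` rather than `Second`.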

lucaong commented 5 years ago

That sounds right. I am traveling now, but as soon as I am back I will add a test case for that, and if I confirm the bug I will produce a fix.

Thanks a lot for reporting this.

lucaong commented 5 years ago

Actually, there is already a test case for this, and it seems to pass: https://github.com/lucaong/minisearch/blob/master/src/MiniSearch.test.js#L453

I will double-check, but newline characters should already be part of the tokenizer regexp ranges.

samuelmeuli commented 5 years ago

I think that's because in your test case, there are spaces (the indentation) between the newline character and the word you're searching for.
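A small sketch of why indentation could hide the issue, again using a hypothetical separator pattern (not the library's real regex) and a made-up `tokenizeWords` helper. When spaces sit between the newline and the next word, the newline ends up as its own token and the word stays clean:

```javascript
// Hypothetical separator pattern without \n or \r (not MiniSearch's real regex).
const separator = /[ .,;:!?]+/;

// Illustrative helper: split on the separator and drop empty strings.
const tokenizeWords = (input) => input.split(separator).filter(Boolean);

// Indented second line, as in a template literal inside a test file.
const indented = tokenizeWords('alpha.\n    beta');
// Second line starting directly after the newline.
const flushLeft = tokenizeWords('alpha.\nbeta');

// The indentation splits the newline away from the word.
console.log(JSON.stringify(indented));  // ["alpha","\n","beta"]
// Without indentation, the newline is attached to the word.
console.log(JSON.stringify(flushLeft)); // ["alpha","\nbeta"]
```

In this sketch, a search for `beta` finds the indented text but not the flush-left one, which would be consistent with a test passing only because of its indentation.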

I can work on a PR if you'd like :)

lucaong commented 5 years ago

Sure, a PR would be awesome :)

lucaong commented 5 years ago

I will create a new release including your fix later today. Thanks a lot for the PR!

samuelmeuli commented 5 years ago

Awesome, thank you!