The `unicode-8.0.0` package has been deprecated for a while. The README also recommends using `regenerate` to build regexes, which is much nicer than the way we were doing it before.
But also, a persistent annoyance with lunr-languages was that numbers were missing from `wordCharacters` in all the Latin- and Cyrillic-based languages, even though they are present in the default `wordCharacters`. (Also, Indic-Arabic numerals are present for Arabic, Hindi, etc.) So this adds them back, fixing #66 and maybe some other bugs.
The problem of the trimmer not being run in the search pipeline persists, but that's a lunr.js bug :) At least now things like "HAL9000" will get indexed.
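
To illustrate why the missing digits mattered, here is a minimal sketch. It assumes a trimmer that strips leading and trailing characters not listed in `wordCharacters`. `makeTrimmer` is a hypothetical helper written for this example, not the library's API, and the two character ranges are simplified stand-ins for the real per-language values.

```javascript
// Hypothetical helper modelled on a trimmer: it strips leading and trailing
// characters that are not in the given character-class body.
function makeTrimmer(wordCharacters) {
  const leading = new RegExp("^[^" + wordCharacters + "]+");
  const trailing = new RegExp("[^" + wordCharacters + "]+$");
  return (token) => token.replace(leading, "").replace(trailing, "");
}

// Simplified stand-ins (assumption: the real wordCharacters values cover
// far more of the Latin range than plain A-Z and a-z).
const lettersOnly = "A-Za-z";
const lettersAndDigits = "A-Za-z0-9";

const trimWithoutDigits = makeTrimmer(lettersOnly);
const trimWithDigits = makeTrimmer(lettersAndDigits);

console.log(trimWithoutDigits("HAL9000")); // "HAL": trailing digits are stripped
console.log(trimWithDigits("HAL9000"));    // "HAL9000": digits count as word characters
console.log(trimWithDigits("(HAL9000)"));  // "HAL9000": punctuation is still trimmed
```

With digits in the character class, the token survives trimming intact, while surrounding punctuation is still removed.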