Include digits and update unicode regex generation

The unicode-8.0.0 package has been deprecated for a while. The README also recommends using regenerate to build regexes, which is much nicer than the way we were doing it before.

A persistent annoyance with lunr-languages has also been that digits were missing from wordCharacters in all the Latin- and Cyrillic-based languages, even though they are present in the default wordCharacters (and Indic-Arabic numerals are present for Arabic, Hindi, etc.). This adds them back, fixing #66 and possibly some other bugs.
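To illustrate why the missing digits matter, here is a minimal sketch of a trimmer built from a wordCharacters string. The `makeTrimmer` helper and the two character ranges are simplified, hypothetical stand-ins (ASCII only) and not the actual lunr-languages code or its generated Unicode ranges; the sketch only assumes that a trimmer strips leading and trailing characters that fall outside the word-character class.

```javascript
// Hypothetical, simplified trimmer factory: removes leading and trailing
// characters that are NOT in the given character-class body.
function makeTrimmer(wordCharacters) {
  const start = new RegExp("^[^" + wordCharacters + "]+");
  const end = new RegExp("[^" + wordCharacters + "]+$");
  return (token) => token.replace(start, "").replace(end, "");
}

// Illustrative ASCII-only classes (assumption, not the real generated ranges).
const lettersOnly = "A-Za-z";    // digits missing
const withDigits = "A-Za-z0-9";  // digits included

const trimOld = makeTrimmer(lettersOnly);
const trimNew = makeTrimmer(withDigits);

console.log(trimOld("HAL9000")); // "HAL"
console.log(trimNew("HAL9000")); // "HAL9000"
console.log(trimOld("2001:"));   // ""
console.log(trimNew("2001:"));   // "2001"
```

With letters only, trailing digits are treated as trimmable junk and a purely numeric token disappears entirely; once digits are part of the class, both survive.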

The problem of the trimmer not being run in the search pipeline persists, but that's a lunr.js bug :) At least now things like "HAL9000" will get indexed.

MihaiValentin / lunr-languages

Include digits and update unicode regex generation #115