buda-base / lucene-bo

Lucene analyzer for Tibetan
Apache License 2.0

CharFilter for Precomposed Tibetan #10

Open eroux opened 7 years ago

eroux commented 7 years ago

Although I'm not sure it's really used (at least not by BDRC), it might be a good idea to let the analyzer index precomposed Tibetan. See this page and this one for the mappings. A simple MappingCharFilter should be enough, although the mapping table would need a lot of entries.
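To make the idea concrete, here is a minimal sketch of such a mapping table in plain Java. It is only an illustration: the class name `PrecomposedTibetanMapping` and the method `decompose` are hypothetical, and the entries are the canonical decompositions of the precomposed characters in the Unicode Tibetan block, which may not be the same set as the mappings on the pages mentioned above. In the analyzer itself, the same pairs would presumably be fed to `NormalizeCharMap.Builder.add(String, String)` and the built map passed to `MappingCharFilter`.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch: a precomposed -> decomposed table for Tibetan.
// Entries are the canonical decompositions from the Unicode Tibetan block;
// the tables referenced in the issue may contain different or more entries.
final class PrecomposedTibetanMapping {

    static final Map<String, String> MAPPINGS = new LinkedHashMap<>();
    static {
        MAPPINGS.put("\u0F43", "\u0F42\u0FB7"); // GHA  = GA  + subjoined HA
        MAPPINGS.put("\u0F4D", "\u0F4C\u0FB7"); // DDHA = DDA + subjoined HA
        MAPPINGS.put("\u0F52", "\u0F51\u0FB7"); // DHA  = DA  + subjoined HA
        MAPPINGS.put("\u0F57", "\u0F56\u0FB7"); // BHA  = BA  + subjoined HA
        MAPPINGS.put("\u0F5C", "\u0F5B\u0FB7"); // DZHA = DZA + subjoined HA
        MAPPINGS.put("\u0F69", "\u0F40\u0FB5"); // KSSA = KA  + subjoined SSA
        MAPPINGS.put("\u0F73", "\u0F71\u0F72"); // vowel II = AA + I
        MAPPINGS.put("\u0F75", "\u0F71\u0F74"); // vowel UU = AA + U
        MAPPINGS.put("\u0F81", "\u0F71\u0F80"); // reversed II = AA + reversed I
    }

    // Replaces every mapped code point with its decomposed sequence and
    // copies everything else unchanged. With Lucene on the classpath, the
    // equivalent would be to add each pair to a NormalizeCharMap.Builder
    // and wrap the input Reader in a MappingCharFilter.
    static String decompose(String input) {
        StringBuilder out = new StringBuilder(input.length());
        input.codePoints().forEach(cp -> {
            String key = new String(Character.toChars(cp));
            out.append(MAPPINGS.getOrDefault(key, key));
        });
        return out.toString();
    }

    public static void main(String[] args) {
        // Print the code points produced for a single precomposed GHA.
        decompose("\u0F43").codePoints()
                .forEach(cp -> System.out.printf("U+%04X%n", cp));
    }
}
```

Keeping the table as data (a map, or a resource file loaded at startup) rather than hard-coding it in the filter would make it easier to handle the large number of entries.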