After exploring different options for Tibetan collation in JavaScript, we found no viable way to sort Unicode Tibetan strings. This library aims to fulfill this purpose in an elegant, modern and efficient manner.
The most logical option for sorting Tibetan would be to use Intl.Collator. The problem is that all browsers seem to use ICU to implement this object, and ICU has a bug in Tibetan collation that won't be fixed in the short term. It will take even more time for the fix to appear in mainstream browsers, so it's not even a medium-term solution. Bugs have been filed for Firefox, ChakraCore, Chrome and Safari.
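For reference, the standard approach described above would look roughly like the sketch below. The `bo` locale tag and the sample words are chosen for illustration only, and given the ICU issue mentioned above, the resulting order should not be relied on for Tibetan.

```javascript
// Standard-library approach: request a collator for Tibetan ("bo").
// If the runtime has no data for "bo", it silently falls back to its default locale.
const collator = new Intl.Collator('bo');

// Sample words, for illustration only.
const words = ['གཞན', 'ཀ', 'བོད'];

// collator.compare is a bound function, so it can be passed directly to sort().
// Copy the array first, because sort() works in place.
const sortedWithIntl = [...words].sort(collator.compare);

// Lists which of the requested locales the runtime claims to support.
const supported = Intl.Collator.supportedLocalesOf(['bo']);

console.log(sortedWithIntl, supported);
```

Even when `supportedLocalesOf` reports `bo`, that only says locale data is present, not that the ordering it produces is correct.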
Pure JavaScript implementations of Intl.Collator don't seem to exist, as the only Intl polyfill doesn't support it.
The only library we found that could be of use is lasca, but it proved very buggy and extremely inefficient.
This implementation aims to be very efficient, at the cost of not handling some difficult corner cases in Tibetan. As a consequence:
\u0F77 is not treated like \u0FB2\u0F71\u0F80
&ཀར<ཀརྐ is too difficult to handle
yarn add tibetan-sort-js --save
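On the \u0F77 limitation noted above: callers who need both encodings to compare as equal could expand compatibility characters themselves before comparing. The sketch below uses the built-in `String.prototype.normalize('NFKD')`; `expandCompat` is a hypothetical helper written for this example and is not part of the library.

```javascript
// Hypothetical helper (not part of the library): expand compatibility
// characters such as U+0F77 before handing strings to a comparator.
function expandCompat(s) {
  return s.normalize('NFKD');
}

const precomposed = '\u0F40\u0F77';            // KA + vowel sign vocalic RR
const decomposed = '\u0F40\u0FB2\u0F71\u0F80'; // KA + subjoined RA + U+0F71 + U+0F80

console.log(precomposed === decomposed);                             // false
console.log(expandCompat(precomposed) === expandCompat(decomposed)); // true
```

NFKD also rewrites other compatibility characters (for example, U+0F0C becomes U+0F0B), so whether this pre-processing is appropriate depends on the data.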
Compares two strings in Tibetan Unicode; it can be passed as the comparison function to Array.prototype.sort(). The behavior is undefined if the arguments are not strings. It does not work well with non-Tibetan strings.
Parameters
Returns number: 0 if equivalent, 1 if a > b, -1 if a < b
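A minimal sketch of how a comparator with this contract plugs into Array.prototype.sort(). `codeUnitCompare` is a stand-in written for this example, not the library's `compare`: it orders strings by UTF-16 code unit, which happens to match alphabetical order for the bare consonants used here but is not correct Tibetan collation in general.

```javascript
// Stand-in comparator with the contract described above:
// returns 0 if equivalent, 1 if a > b, -1 if a < b.
function codeUnitCompare(a, b) {
  if (a === b) return 0;
  return a > b ? 1 : -1;
}

// GA (U+0F42), KA (U+0F40), KHA (U+0F41), deliberately out of order.
const letters = ['\u0F42', '\u0F40', '\u0F41'];

// Keep only strings, since behavior on other argument types is undefined,
// and copy the array because sort() works in place.
const sortedLetters = letters
  .filter((s) => typeof s === 'string')
  .sort(codeUnitCompare);

console.log(sortedLetters); // KA, KHA, GA
```

With the library installed, its `compare` function would take the place of the stand-in; how it is exported is not stated in this section.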
Compares two strings in EWTS; it takes the same arguments and returns the same value as compare. The function only works on customary EWTS and doesn't handle oddly encoded cases such as b.r+g+ya (instead of brgya).
See change log.
The code is Copyright 2017-2019 Buddhist Digital Resource Center, and is provided under the MIT License.