Open jiguanglizipao opened 2 years ago
@jiguanglizipao Sorry for the late reply.
Is it possible to specify string IDs or make them ranked in lexicographical order?
No. Because of the underlying data structure, string IDs come out in an effectively random order. If you need a lexicographic mapping, you have to construct a permutation outside Xcdat.
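As a rough sketch of what "a permutation outside Xcdat" could look like: sort the keys yourself, look up each one, and record both directions of the mapping. The names `LexOrder` and `build_lex_order` are made up for illustration, and `lookup` is a stand-in callable that you would wire to the lookup of your Xcdat trie. The sketch assumes the lookup returns a `std::optional<std::uint64_t>` and that the IDs are dense in `[0, number of keys)`.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Lexicographic ranks layered on top of arbitrary dictionary IDs.
// (Hypothetical helper type, not part of Xcdat.)
struct LexOrder {
    std::vector<std::uint64_t> id_to_rank;  // id_to_rank[id]   = lexicographic rank
    std::vector<std::uint64_t> rank_to_id;  // rank_to_id[rank] = dictionary ID
};

// `lookup` is a caller-supplied callable mapping a key to
// std::optional<std::uint64_t>. Assumption: IDs are unique and dense
// in [0, number of distinct keys).
template <class Lookup>
LexOrder build_lex_order(std::vector<std::string> keys, Lookup lookup) {
    // Sort bytewise and drop duplicates so that rank == position.
    std::sort(keys.begin(), keys.end());
    keys.erase(std::unique(keys.begin(), keys.end()), keys.end());

    LexOrder order;
    order.id_to_rank.resize(keys.size());
    order.rank_to_id.resize(keys.size());

    for (std::uint64_t rank = 0; rank < keys.size(); ++rank) {
        const std::optional<std::uint64_t> id = lookup(keys[rank]);
        if (!id || *id >= keys.size()) {
            throw std::runtime_error("key missing or ID out of range: " + keys[rank]);
        }
        order.id_to_rank[*id] = rank;
        order.rank_to_id[rank] = *id;
    }
    return order;
}
```

With this, `id_to_rank[id]` turns an ID from the dictionary into a lexicographic rank, and `rank_to_id[rank]` goes the other way, so you can decode the key at a given rank. The two arrays cost 16 bytes per key as written; a compact integer vector would shrink that if memory matters.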
If not, what is the strategy/order for generating the encoding?
Xcdat arranges the trie nodes in an array (almost randomly) following the double-array scheme, and assigns string IDs based on that arrangement. That is why the IDs do not follow lexicographic order.
In Sample usage, xcdat produces a string-to-ID encoding which seems to be random and not in lexicographical order. Is it possible to specify string IDs or make them ranked in lexicographical order? If not, what is the strategy/order for generating the encoding?