Closed by Manishearth 6 days ago
@sffc looking at baked data, we only use 21 numbering systems. Should we set up datagen to only generate numbering systems found in DecimalSymbolsV1, or all possible systems? I suspect the latter is more robust and follows our data principles.
numsys: tinystr::tinystr!(8usize, "adlm")
numsys: tinystr::tinystr!(8usize, "arab")
numsys: tinystr::tinystr!(8usize, "arabext")
numsys: tinystr::tinystr!(8usize, "beng")
numsys: tinystr::tinystr!(8usize, "deva")
numsys: tinystr::tinystr!(8usize, "gujr")
numsys: tinystr::tinystr!(8usize, "guru")
numsys: tinystr::tinystr!(8usize, "hanidec")
numsys: tinystr::tinystr!(8usize, "java")
numsys: tinystr::tinystr!(8usize, "khmr")
numsys: tinystr::tinystr!(8usize, "knda")
numsys: tinystr::tinystr!(8usize, "laoo")
numsys: tinystr::tinystr!(8usize, "latn")
numsys: tinystr::tinystr!(8usize, "mlym")
numsys: tinystr::tinystr!(8usize, "mymr")
numsys: tinystr::tinystr!(8usize, "nkoo")
numsys: tinystr::tinystr!(8usize, "olck")
numsys: tinystr::tinystr!(8usize, "orya")
numsys: tinystr::tinystr!(8usize, "tamldec")
numsys: tinystr::tinystr!(8usize, "telu")
numsys: tinystr::tinystr!(8usize, "thai")
> Should we set up datagen to only generate numbering systems found in DecimalSymbolsV1, or all possible systems? I suspect the latter is more robust and follows our data principles.
Since these are data marker attributes, making this configurable in datagen is consistent with our data principles. Datagen should be capable of both, and it should get there once we add support for filtering by attributes (has that landed yet?).
I think the default can be "auto", that is, select all DecimalDigitsV1 entries that are reachable from DecimalSymbolsV2.
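To make the "auto" idea concrete, here is a rough sketch of the selection step. Everything in it is hypothetical: `DigitSelection`, `select_numbering_systems`, and the plain maps standing in for DecimalSymbolsV2 and DecimalDigitsV1 are illustration only, not the actual datagen API.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical selection mode for digit data (not an existing datagen option).
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum DigitSelection {
    /// Generate digits for every numbering system in the source data.
    All,
    /// Generate digits only for systems named by a generated symbols entry.
    Auto,
}

/// `symbols` maps a locale to the numbering system its symbols entry names
/// (a stand-in for DecimalSymbolsV2). `all_systems` stands in for every
/// numbering system available as a DecimalDigitsV1 attribute.
pub fn select_numbering_systems(
    mode: &DigitSelection,
    symbols: &BTreeMap<&str, &str>,
    all_systems: &BTreeSet<&str>,
) -> BTreeSet<String> {
    match mode {
        DigitSelection::All => all_systems.iter().map(|s| s.to_string()).collect(),
        DigitSelection::Auto => symbols
            .values()
            // Skip names that have no digit data to generate.
            .filter(|s| all_systems.contains(*s))
            .map(|s| s.to_string())
            .collect(),
    }
}
```

With symbols for `en` (latn), `ar` (arab), and `bn` (beng), and source data that also contains `thai`, "auto" would keep three systems and "all" would keep four.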
We only have a couple of numbering systems. DecimalSymbolsV2 should name the system, and we can load it from a DecimalDigitsV1 key.
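The two-step lookup could look roughly like the sketch below. The `Symbols` struct, the `digits_for_locale` function, and the hash maps are hypothetical stand-ins for the DecimalSymbolsV2 and DecimalDigitsV1 payloads, assuming the symbols entry carries only the numbering system name and the digits live under a separate key.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a symbols payload: it only names its numbering system.
pub struct Symbols {
    pub numsys: &'static str,
}

/// Resolves digits in two steps: read the numbering system name from the
/// locale's symbols entry, then load the digits stored under that name.
/// Returns `None` if either the locale or the numbering system is missing.
pub fn digits_for_locale(
    locale: &str,
    symbols: &HashMap<&str, Symbols>,
    digits: &HashMap<&str, [char; 10]>,
) -> Option<[char; 10]> {
    let numsys = symbols.get(locale)?.numsys;
    digits.get(numsys).copied()
}
```

One consequence of this split is that locales sharing a numbering system would share a single digits entry instead of each carrying its own copy.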