We need to be much clearer about the different normalization stages. So far we have:
1. NFD normalization, which is applied before any other normalization.
2. Source-target normalization: h₂/ə is interpreted as ə. This follows a feature in EDICTOR/LingPy, which in LingPy is triggered with the keyword "cldf". It is not yet officially documented in the Segments specification, as this needs to wait for lexibank and CLTS as well.
3. Lookalike normalization: replacement of individual characters in the NFD-normalized string by their counterparts in our normalize.tsv list.
Additionally, we have aliases (also a kind of normalization) and automatically generated sounds. Our Sound objects should ideally reflect these stages; otherwise it will be difficult to trace what happened to a given sound.
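
To make the proposal concrete, here is a minimal sketch of how the three stages could be recorded per segment. Everything in it is an assumption for illustration: `NormalizationTrace`, `normalize_segment`, and the `LOOKALIKES` dict (a stand-in for normalize.tsv with two made-up entries) are hypothetical names, not existing API, and aliases and generated sounds are left out.

```python
import unicodedata
from dataclasses import dataclass, field

# Hypothetical stand-in for normalize.tsv: lookalike character -> preferred
# character. These two entries are illustrative, not taken from the real file.
LOOKALIKES = {
    "g": "\u0261",  # LATIN SMALL LETTER G -> LATIN SMALL LETTER SCRIPT G
    ":": "\u02d0",  # COLON -> MODIFIER LETTER TRIANGULAR COLON
}


@dataclass
class NormalizationTrace:
    """Keeps the form of a segment after every normalization stage."""

    source: str
    stages: list = field(default_factory=list)  # (stage name, resulting string)

    @property
    def result(self):
        # The final form is whatever the last stage produced.
        return self.stages[-1][1] if self.stages else self.source


def normalize_segment(segment):
    trace = NormalizationTrace(source=segment)

    # Stage 1: NFD is applied before anything else.
    current = unicodedata.normalize("NFD", segment)
    trace.stages.append(("nfd", current))

    # Stage 2: source/target notation, keep the target after the slash.
    _, slash, target = current.partition("/")
    if slash and target:
        current = target
    trace.stages.append(("source-target", current))

    # Stage 3: lookalike replacement, one character at a time.
    current = "".join(LOOKALIKES.get(char, char) for char in current)
    trace.stages.append(("lookalike", current))

    return trace
```

With something like this, `normalize_segment("h₂/ə").stages` would list the form after each stage, so a Sound object could carry the trace along and show where a change was introduced.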