[...] maybe it would be nice to have a browsable list of clf (classification) changes between versions. It's cheap since it can be generated automatically. E.g., let V1 and V2 be the older and the later version respectively.
First generate the list of language-level inventory changes, which consists of (a) lgs(V1) \ lgs(V2)
"The following languages were removed from the language inventory"
Lg; Family(V1, lg); Macro_Area; Comment
Where Comment is one of three possibilities:
(i) if moved to bookkeeping: "Spurious see link to lg in V2"
(ii) if promoted to subfamily: "Rendered as subfamily see link to lg in V2"
(iii) if demoted to dialect: "Rendered as dialect see link to lg in V2"
and (b) lgs(V2) \ lgs(V1)
"The following languages were added to the language inventory"
Lg; Family(V2, lg); Macro_Area; Comment
Where Comment is one of three possibilities:
(i) if non-existent (at any level) in V1: "Added see link to lg in V2"
(ii) if demoted from subfamily: "Previously rendered as subfamily see link to lg in V2"
(iii) if promoted from dialect: "Previously rendered as dialect see link to lg in V2"
For classification rearrangements, for each lg in the intersection of lgs(V1) and lgs(V2), consider its parent paths p1 and p2 in V1 and V2 respectively. E.g., the parent path for Yaroame [yaro1235] is (yano1268, nina1239, yano1266). For each lg where p1 != p2, group on the tuple (p1, p2) and show
"The following languages were moved"
Lg; Family(V1/V2, lg); From-To(p1, p2); Macro_Area; Reference
Where Family(V1/V2, lg) is the Family of the lg, or the string family1/family2 if the families differ between the versions; Reference is just a link to the clf reference (in V2); and From-To(p1, p2) can be computed as follows. Align p1 and p2 by Levenshtein distance to get an aligned sequence a_1 ... a_n where each a_i = (x, y) is a pair of path elements from p1 and p2, either of which may be None. (To break ties among alignments with the same Levenshtein distance, prefer the one with the minimal number of substitutions.) From the a_i sequence form the sequence whose i-th element is
"..." if x == y
"x" if y == None
"y" if x == None
"x->y" otherwise
From-To(p1, p2) is then the comma-separated concatenation of this latter sequence, but with any run of consecutive "..." elements replaced by just one "...".
E.g., the paths
(yano1268, nina1239, yano1266) and (yano1268, nina1239, yano1266, aaaa1234)
would get From-To: ..., aaaa1234
(yano1268, nina1239, yano1266) and (yano1268, yano1266, aaaa1234)
would get From-To: ..., nina1239, ..., aaaa1234
(yano1268, nina1239, yano1266) and (yano1268, nina1239, aaaa1234)
would get From-To: ..., yano1266->aaaa1234
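A possible sketch of the From-To computation and the grouping on (p1, p2), to make the description above concrete. The function names and the input shape (dicts from lg code to parent path) are assumptions for illustration. The alignment is a standard edit-distance dynamic program over path elements whose cost is the pair (distance, number of substitutions), compared lexicographically, so the tie-break falls out of the minimisation.

```python
from collections import defaultdict


def align(p1, p2):
    """Align two paths; returns pairs (x, y) where x or y may be None."""
    n, m = len(p1), len(p2)
    # cost[i][j] = (edit distance, substitutions) for p1[:i] versus p2[:j]
    cost = [[None] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = (0, 0)
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            options = []
            if i > 0 and j > 0:  # match or substitution
                d, s = cost[i - 1][j - 1]
                diff = 0 if p1[i - 1] == p2[j - 1] else 1
                options.append(((d + diff, s + diff), (i - 1, j - 1, (p1[i - 1], p2[j - 1]))))
            if i > 0:  # element only in p1
                d, s = cost[i - 1][j]
                options.append(((d + 1, s), (i - 1, j, (p1[i - 1], None))))
            if j > 0:  # element only in p2
                d, s = cost[i][j - 1]
                options.append(((d + 1, s), (i, j - 1, (None, p2[j - 1]))))
            # Tuples compare lexicographically: distance first, then substitutions.
            cost[i][j], back[i][j] = min(options, key=lambda o: o[0])
    pairs = []
    i, j = n, m
    while (i, j) != (0, 0):  # walk the back-pointers to recover the alignment
        i, j, pair = back[i][j]
        pairs.append(pair)
    return pairs[::-1]


def from_to(p1, p2):
    """Render the aligned pairs, collapsing runs of "..." into one."""
    parts = []
    for x, y in align(p1, p2):
        if x == y:
            label = "..."
        elif y is None:
            label = x
        elif x is None:
            label = y
        else:
            label = f"{x}->{y}"
        if label == "..." and parts and parts[-1] == "...":
            continue
        parts.append(label)
    return ", ".join(parts)


def moved_groups(paths1, paths2):
    """Group lgs present in both versions by their differing (p1, p2)."""
    groups = defaultdict(list)
    for lg in sorted(paths1.keys() & paths2.keys()):
        p1, p2 = tuple(paths1[lg]), tuple(paths2[lg])
        if p1 != p2:
            groups[(p1, p2)].append(lg)
    return dict(groups)
```

With the three example path pairs above, from_to gives the same strings as listed. Each key of moved_groups would then become one row group in the "moved" list, with from_to(p1, p2) computed once per group.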