globalwordnet / english-wordnet

The Open English WordNet
https://en-word.net/

Generation of escaped XML IDs, escaping of colon #1109

Closed by 1313ou 1 month ago

1313ou commented 1 month ago

See https://github.com/globalwordnet/english-wordnet/issues/1107

This affects 3+1 entries, whose IDs will change when the XML is regenerated:

señor, señora, señorita -> oewn-señor-n, etc. (ñ is valid in XML IDs, so it is kept as-is)

Capital: Critique of Political Economy -> oewn-Capital-cn-_Critique_of_Political_Economy-n (: is not valid in XML IDs, so it is escaped as -cn-)
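
For illustration only, here is a minimal sketch of the escaping described above. The names `ESCAPES`, `escape_lemma_sketch` and `make_entry_id` are hypothetical and are not taken from the repository. Only the two mappings visible in the examples (`:` to `-cn-`, space to `_`) are included, and `str.isalnum()` is used as a rough stand-in for "valid in an XML ID", which is an assumption rather than the exact XML rule.

```python
# Hypothetical sketch: only the two escapes shown in this issue are modelled.
ESCAPES = {
    ":": "-cn-",  # colon is not allowed in the ID, so it is spelled out
    " ": "_",     # spaces become underscores
}


def escape_lemma_sketch(lemma: str) -> str:
    """Escape a lemma so it can be embedded in an entry ID (sketch only)."""
    out = []
    for ch in lemma:
        if ch in ESCAPES:
            out.append(ESCAPES[ch])
        elif ch.isalnum() or ch in "._-":
            # Assumption: Unicode letters and digits (e.g. ñ) are kept as-is.
            out.append(ch)
        else:
            raise ValueError(f"no escape defined for {ch!r} in this sketch")
    return "".join(out)


def make_entry_id(lemma: str, pos: str) -> str:
    """Build an ID of the form oewn-<escaped lemma>-<pos>."""
    return f"oewn-{escape_lemma_sketch(lemma)}-{pos}"


print(make_entry_id("señor", "n"))
print(make_entry_id("Capital: Critique of Political Economy", "n"))
```

Under these assumptions the two calls reproduce the IDs quoted above; any character outside the sketch's small table raises an error instead of being passed through silently.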

1313ou commented 1 month ago

This requires changing valid_id = re.compile("^oewn-[A-Za-z0-9_\\-.]*$") in validate.py:181, since the current pattern only accepts ASCII letters and would reject IDs containing ñ. Will do it later.
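
One possible shape for that change, as a sketch rather than the actual fix: in Python 3, `\w` in a `str` pattern matches Unicode letters, digits and underscore, so it accepts `ñ` while still rejecting `:`. Note that `\w` only approximates the set of characters XML allows in IDs; the names `old_valid_id` and `new_valid_id` are used here purely for comparison.

```python
import re

# Pattern quoted in the comment above (ASCII letters only).
old_valid_id = re.compile(r"^oewn-[A-Za-z0-9_\-.]*$")

# Assumed relaxation: \w covers Unicode word characters, including ñ.
new_valid_id = re.compile(r"^oewn-[\w\-.]*$")

for candidate in (
    "oewn-señor-n",
    "oewn-Capital-cn-_Critique_of_Political_Economy-n",
    "oewn-Capital:_Critique_of_Political_Economy-n",
):
    print(
        candidate,
        bool(old_valid_id.match(candidate)),
        bool(new_valid_id.match(candidate)),
    )
```

The unescaped colon in the last candidate is rejected by both patterns, which is the behaviour the escaping relies on.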