The current data structures used for terminology data are not easy to understand or work with. Currently there are three lists that are iterated over in alignment (that is, the entries at the same index belong together): "accepts" (proposed replacements and information about usage context), "patterns" (groups of main search patterns), and "contextpatterns" (groups of patterns that are matched around the main search patterns).
[More detail -> see def buildtermdata in __init__.py.]
This "iterating in alignment" approach is somewhat hairy and easy to get wrong (even though we do seem to get it right currently, which is good), and in particular it is hard to maintain.
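To make the fragility concrete, here is a minimal sketch of what iterating over three aligned lists amounts to. The sample values and the helper name iterate_aligned are made up for illustration; the real lists are built in buildtermdata and may be shaped differently. The length check is an assumption about how one could at least detect the lists drifting apart.

```python
# Hypothetical sample data; the real lists come from buildtermdata.
accepts = ["log in", "set up"]
patterns = [["login"], ["setup"]]
contextpatterns = [["to"], ["to"]]


def iterate_aligned(accepts, patterns, contextpatterns):
    """Return one (accept, patterns, contextpatterns) triple per index."""
    # zip() silently stops at the shortest list, so a missing entry in
    # one list would otherwise go unnoticed; check the lengths first.
    if not (len(accepts) == len(patterns) == len(contextpatterns)):
        raise ValueError("term lists are out of alignment")
    return list(zip(accepts, patterns, contextpatterns))


triples = iterate_aligned(accepts, patterns, contextpatterns)
```

Note that the check only catches a length mismatch. If an entry is inserted at the wrong position in one list, the lengths still agree and the wrong data gets paired up, which is the maintainability problem described above.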
Ideas to improve that area:
OO: we could provide an instance of a class for each <pattern> element (but I would first need to get a better grip on OO to see whether that is useful).
Ordered dictionaries: dictionaries provide keys, which would make these structures at least a bit more accessible. The "collections" module provides ordered dictionaries (OrderedDict), which keep insertion order rather than sorting by key; that would be helpful because in enough cases the ordering of <pattern> elements is (unfortunately) meaningful.
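The two ideas could also be combined. The following is only a rough sketch under assumptions, not a proposal for the final layout: the class name TermPattern, the function build_term_index, and the choice of the first main pattern as dictionary key are all hypothetical, and the sample values are invented.

```python
from collections import OrderedDict


class TermPattern(object):
    """Hypothetical container for everything belonging to one <pattern>."""

    def __init__(self, accept, patterns, contextpatterns):
        self.accept = accept                    # proposed replacement / usage info
        self.patterns = patterns                # main search patterns
        self.contextpatterns = contextpatterns  # patterns matched around them


def build_term_index(accepts, patterns, contextpatterns):
    """Fold the three aligned lists into one ordered mapping."""
    index = OrderedDict()
    for accept, pats, ctx in zip(accepts, patterns, contextpatterns):
        # Assumption: the first main pattern is usable as a key.
        # A duplicate key would overwrite the earlier entry.
        index[pats[0]] = TermPattern(accept, pats, ctx)
    return index


index = build_term_index(
    ["log in", "set up"],
    [["login"], ["setup"]],
    [["to"], []],
)
```

With this layout, the data for one <pattern> element travels together in a single object, entries can be looked up by key (for example index["setup"].accept), and iterating over index still visits the entries in the order they were added.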