We currently use (at least) three separate but very similar schemes for parsing XML tags. For appendices, we have a single class, AppendixProcessor, which steps through the tags within an APPENDIX and calls a method for each. At the end of each "group", it runs the depth-derivation code we use elsewhere. For regulation sections, we have ParagraphProcessor (and its subclasses), which holds a list of tag-matching-and-processing children. It steps through the tags, finds the first applicable tag-matcher for each, and runs depth derivation at the end of the process. At a higher level, we select between SECTION, SUBPART, etc. processors through a system of plugins, where the top-level PART's children are compared to the plugin tag matchers.
Each of these styles has slightly different mechanics. We should unify the best ideas from this slowly evolving system:
Have a set of functions (or classes) that each handle one type of tag and return a list of parsed sub-elements
Define these as plugins
Design glue code that knows how to automatically find the plugins relevant to a particular XML tag's children
Use depth derivation once a parent processor has its list of children
Make use of some sort of ordering scheme for these plugins
Write preprocessing plugins to adjust the XML if it's nested in a way we don't like
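
To make the proposal concrete, here is a minimal sketch of how the pieces above might fit together. Everything in it is an assumption made for illustration: the names TagPlugin, register, process_children, derive_depths and unwrap_extract are hypothetical, the EXTRACT wrapper and P tags are stand-ins, derive_depths is a placeholder for the existing depth-derivation code, and the standard library's xml.etree.ElementTree is used only to keep the example self-contained.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class TagPlugin:
    """One handler for one type of tag (hypothetical structure)."""
    name: str
    order: int  # lower numbers are tried first
    matches: Callable[[ET.Element], bool]
    process: Callable[[ET.Element], List[dict]]


REGISTRY: List[TagPlugin] = []
PREPROCESSORS: List[Callable[[ET.Element], None]] = []


def register(name, order, matches):
    """Decorator that adds a processing function to the ordered registry."""
    def decorator(fn):
        REGISTRY.append(TagPlugin(name, order, matches, fn))
        REGISTRY.sort(key=lambda plugin: plugin.order)
        return fn
    return decorator


def unwrap_extract(parent):
    """Preprocessor: hoist the children of any direct EXTRACT child into
    the parent. Handles one level of nesting only."""
    for idx, child in reversed(list(enumerate(parent))):
        if child.tag == "EXTRACT":
            parent.remove(child)
            for offset, grandchild in enumerate(list(child)):
                parent.insert(idx + offset, grandchild)


PREPROCESSORS.append(unwrap_extract)


def derive_depths(children):
    """Placeholder for the real depth-derivation code: marks everything
    as depth 0 so the sketch runs end to end."""
    for child in children:
        child.setdefault("depth", 0)
    return children


def process_children(parent):
    """Glue code: preprocess, dispatch each child tag to the first matching
    plugin, then derive depths once the full list of children is known."""
    for preprocess in PREPROCESSORS:
        preprocess(parent)
    children = []
    for xml_child in parent:
        for plugin in REGISTRY:
            if plugin.matches(xml_child):
                children.extend(plugin.process(xml_child))
                break
        # Tags with no matching plugin are silently skipped in this sketch.
    return derive_depths(children)


@register("paragraph", order=10, matches=lambda el: el.tag == "P")
def paragraph(el):
    return [{"type": "paragraph", "text": (el.text or "").strip()}]


@register("section", order=20, matches=lambda el: el.tag == "SECTION")
def section(el):
    # A parent processor recurses through the same glue code.
    return [{"type": "section", "children": process_children(el)}]


root = ET.fromstring(
    "<PART><SECTION><P>a</P><EXTRACT><P>b</P></EXTRACT></SECTION>"
    "<P>c</P></PART>"
)
tree = process_children(root)
```

Keeping the ordering as a plain integer on each plugin is only one option; a dependency-style scheme ("run before X") would also satisfy the ordering requirement, at the cost of more glue code.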