MRG Refs to include HRGT options (at least multiple converters)

RieksJ commented 9 months ago

HRGs are typically specified by an MRGRef. Currently, its syntax accepts only a limited set of configuration options.

This issue requests

- [ ] to implement one of the following MRGRef syntax changes:
  - [ ] (preferred option): modify the MRGRef syntax to {% hrgt="<tid>" <other args> %}, where
    - hrgt replaces the old hrg, suggesting that the HRGT is called here (this could be a start for having other tools called from 'within' documents at a later stage);
    - <tid> is a terminology identifier, as it currently already is;
    - <other args> could basically be interpreted as if it were the command line of the HRGT for creating this HRG. It would allow, for example, the specification of a converter array, a sorter, etc.
  - [ ] (if the preferred option is too difficult): modify the MRGRef syntax to at least accept multiple converters.
- [ ] syntax changes have been documented in the tev2-specifications repo;
- [ ] the TEv2-glossary page works.
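
To make the preferred option concrete, here is a minimal sketch of how <other args> might be read as if it were a command line. The option names --converter and --sorter, the values, and the function name parse_mrgref are assumptions for illustration only; they are not taken from the actual HRGT.

```python
import argparse
import re
import shlex

# Hypothetical pattern for the proposed form: {% hrgt="<tid>" <other args> %}
MRGREF = re.compile(r'\{%\s*hrgt="(?P<tid>[^"]+)"\s*(?P<args>.*?)\s*%\}')


def parse_mrgref(text):
    """Return (tid, options) for the first MRGRef in text, or None if there is none."""
    match = MRGREF.search(text)
    if match is None:
        return None
    parser = argparse.ArgumentParser(prog="hrgt")
    # Assumed option names; 'append' lets --converter be repeated any number of times.
    parser.add_argument("--converter", action="append", default=[])
    parser.add_argument("--sorter", default=None)
    # shlex.split tokenizes the argument string the way a POSIX shell would.
    options = parser.parse_args(shlex.split(match.group("args")))
    return match.group("tid"), options


tid, options = parse_mrgref('{% hrgt="default" --converter conv-a --converter conv-b --sorter my-sorter %}')
print(tid, options.converter, options.sorter)
```

Because the argument string is tokenized and parsed but never executed, an approach like this gets command-line flexibility without handing document content to a shell.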

Ca5e commented 8 months ago

This issue indeed motivates thinking about using the {% hrgt="<tid>" <other args> %} syntax to call other tools as well. In my opinion, scanning documents for occurrences of these specific syntaxes should not be part of the to-be-called tool, but should instead be handled by some other tool that has calling other tools as its main task (I would say "calling other commands", but this seems very risky from a cross-site scripting perspective). If we let the to-be-called tool scan documents, I suspect we end up in a less ideal situation in which every tool has to process a large number of files.
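
A rough sketch of what such a separate "caller" tool could look like, assuming a generic call form {% <tool>="<tid>" <other args> %}. The registry TOOLS, the stand-in render_hrg, and the function dispatch are hypothetical names; the allowlist is one way to avoid running arbitrary commands found in documents.

```python
import re

# Assumed generic call site: {% <tool>="<tid>" <other args> %}
CALL = re.compile(r'\{%\s*(?P<tool>\w+)="(?P<tid>[^"]*)"\s*(?P<args>.*?)\s*%\}')


def render_hrg(tid, args):
    # Stand-in for invoking the HRGT; a real dispatcher would call the tool here.
    return f"[HRG for {tid} with args: {args}]"


# Allowlist of callable tools (hypothetical). Unknown names are left untouched,
# so a document cannot trigger anything that is not registered here.
TOOLS = {"hrgt": render_hrg}


def dispatch(document):
    """Scan the document once and replace every registered tool call by its output."""
    def replace(match):
        handler = TOOLS.get(match.group("tool"))
        if handler is None:
            return match.group(0)  # not a registered tool: keep the text as-is
        return handler(match.group("tid"), match.group("args"))
    return CALL.sub(replace, document)


print(dispatch('Glossary: {% hrgt="default" --sorter my-sorter %}'))
```

With this split, only the dispatcher reads the documents; each tool receives just its own identifier and argument string.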

About the use of 'other args': this basically boils down to deciding at what level we want to strictly define how the syntax is interpreted. Right now, only predefined named capturing groups of the HRGT interpreter are used. Within the {% %} syntax, I do like the HTML-esque way of recognizing the hrg, converter, and sorter properties. Sadly, named capturing groups stop working once we allow an unknown number n of converters to be specified.

I believe there are two possible solutions. The first uses a syntax similar to {% hrgt="<tid>" cmd="--converter[2] <converter>" %}. The advantage is that we can keep the same approach as before (using the regex to find named capturing groups) and add one named capturing group whose content is interpreted in the same way as the command-line interface, although this does add quite a bit of complexity. The second solution looks like {% hrgt tid="<tid>" converter[2]="<converter>" %}, which means forcing the use of HTML-esque parameters. Its main advantage is a clearer notation that does not mix different notation styles.

This discussion is caused by the fact that we don't have dynamic (named) capturing groups. A way to achieve this would be to artificially expand the group nodes of the regex in the AST. For instance, we could use an interpreter regex like {%\s*hrgt\s*((?<key>.+)="(?<value>.+)" )\s*%} to define only the format, and not necessarily the names of the groups. Within the AST, the unnamed capturing group that contains the two named capturing groups key and value would be repeated a large number of times, where (in this case) all the odd-numbered repeated capturing groups are used as keys and all the even-numbered ones as values. This seems cool and should work in a lot of situations, but is (maybe too) difficult to explain.
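
For comparison, the second (HTML-esque) notation can be handled without dynamic capturing groups at all. The sketch below does not expand the regex AST; instead, it swaps in a simpler technique: one pattern locates the reference, and a second key="value" pattern is applied repeatedly to its body. Note that Python's re module spells named groups (?P<name>...) rather than (?<name>...). The function name parse_ref and the example values are assumptions.

```python
import re

# The outer pattern only locates the reference; the inner one defines the key="value" format.
REF = re.compile(r'\{%\s*hrgt\s+(?P<body>.*?)\s*%\}')
PAIR = re.compile(r'(?P<key>[\w\[\]]+)="(?P<value>[^"]*)"')
INDEXED = re.compile(r'converter\[(\d+)\]')


def parse_ref(text):
    """Return the options of the first reference in text as a dict, or None."""
    match = REF.search(text)
    if match is None:
        return None
    options, converters = {}, {}
    # finditer yields one match per key="value" pair, however many there are.
    for pair in PAIR.finditer(match.group("body")):
        key, value = pair.group("key"), pair.group("value")
        indexed = INDEXED.fullmatch(key)
        if indexed:
            converters[int(indexed.group(1))] = value
        else:
            options[key] = value
    # Collect converter[n] entries into a list ordered by their index n.
    options["converters"] = [converters[n] for n in sorted(converters)]
    return options


print(parse_ref('{% hrgt tid="default" converter[1]="conv-a" converter[2]="conv-b" sorter="my-sorter" %}'))
```

The key names are discovered at parse time, so any number of converter[n] entries is accepted; the trade-off is that values containing a double quote or %} would need an escaping rule that this sketch does not define.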

RieksJ commented 7 months ago

I am closing this issue because (a) there is no problem statement/bug report that warrants the issue's existence, and (b) discussions about how to deal with MRGRefs are also considered in #25 (which for now is the appropriate place).

tno-terminology-design / tev2-tools

MRG Refs to include HRGT options (at least multiple converters) #35