tno-terminology-design / tev2-tools

The Terminology Engine (v2) is a set of specifications and tools that caters for the creation and maintenance (i.e. curation) of terminologies. This repository contains the sources for the tools.
Apache License 2.0

Generating multiple HRG-entries from a single MRG-entry #29

Closed RieksJ closed 8 months ago

RieksJ commented 9 months ago

It happens that a particular term (e.g., "Human Readable Glossary Tool") has an abbreviation or mnemonic (e.g., "HRGT") that curators may want to include in an HRG, by creating an HRG-entry for the abbreviation with a text saying something like "See <term>", e.g., "HRGT: See Human Readable Glossary Tool".

One way to do this is to allow the HRGT, like the TRRT, to use multiple converters. The HRGT would then invoke all specified converters for every MRG entry that it treats. Each HRGT-converter produces either an HRG-entry or an empty string; the HRGT simply sorts the results at the end and concatenates them to form the HRG. This would give users a lot of additional flexibility.
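The behaviour proposed above can be sketched roughly as follows. This is only an illustration under assumptions: the `MrgEntry` shape, the `abbreviation` field, and the function names are hypothetical, plain functions stand in for the actual converter templates, and a simple alphabetical sort stands in for whatever sorting the HRGT really applies.

```typescript
// Hypothetical, simplified shape of an MRG entry (the real schema has more fields).
interface MrgEntry {
  term: string;
  glossaryText: string;
  abbreviation?: string;
}

// A converter turns one MRG entry into one HRG-entry, or "" to emit nothing.
type Converter = (entry: MrgEntry) => string;

// Main entry for the term itself.
const mainConverter: Converter = (e) => `${e.term}: ${e.glossaryText}`;

// Cross-reference entry, produced only when the entry has an abbreviation.
const abbreviationConverter: Converter = (e) =>
  e.abbreviation ? `${e.abbreviation}: See ${e.term}` : "";

// Run every converter on every entry, drop empty results, sort, concatenate.
function buildHrg(entries: MrgEntry[], converters: Converter[]): string {
  const hrgEntries: string[] = [];
  for (const entry of entries) {
    for (const convert of converters) {
      const rendered = convert(entry);
      if (rendered !== "") {
        hrgEntries.push(rendered);
      }
    }
  }
  // Assumption: plain alphabetical sorting of the rendered entries.
  hrgEntries.sort((a, b) => a.localeCompare(b));
  return hrgEntries.join("\n");
}

const hrg = buildHrg(
  [
    { term: "Human Readable Glossary Tool", glossaryText: "generates an HRG.", abbreviation: "HRGT" },
    { term: "Curator", glossaryText: "maintains a terminology." },
  ],
  [mainConverter, abbreviationConverter],
);
console.log(hrg);
```

The entry without an abbreviation yields one HRG-entry and the entry with one yields two, so a single MRG entry can contribute several lines to the HRG.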

This issue calls for implementing the feature of the HRGT using multiple converters, as described above.

Ca5e commented 9 months ago

How do we handle the situation where we have multiple converters specified on the command line, or in the configuration file, and also find a converter specified within the MRGRef? I believe making the MRGRef interpreter regex able to also detect converter[n] options will require some strange workarounds with named capturing groups. The most straightforward solution would be to ignore all of the other defined converters for the specific glossary and strictly use the one specified in the MRGRef, if it exists. I suppose this limits the use of the MRGRef converter option slightly. It is probably still possible to achieve the requested functionality within a single converter by using an each helper.

RieksJ commented 9 months ago

Here's my understanding:

The MRGref is special, because it adds a new way of specifying stuff as part of a named capturing group of an importer. I do not see a reason why this might not (in future) be generalized further, e.g., that a term-ref interpreter might somehow include a specification of how it should be converted. The fact that this makes the term-ref text messy (similar to the MRGref) would then be something that authors should live with - pay the 'cost' of this messiness to obtain the benefit of such flexibility. Let me be clear: we're not going to do that now, it's just for getting our thoughts in order.

It seems to me that if authors take the trouble of specifying messy MRGrefs/TermRefs, that should take precedence over anything else.

Once a tool has decided what the source of a particular (array of) arguments is, it should stick to that. In other words, if the commandline specifies a single converter and the config file specifies multiple ones, then the source is the commandline, and the con[2] that is specified in the config file does not get used.

I would say that con[error] is NOT one of the elements of the con array for this matter, but is/should be an argument/parameter in its own right, so that - using the example of the previous paragraph - if the commandline doesn't specify a con[error] while the config file does, then the con[error] as specified in the config file is used where necessary.
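As I read the rules above, the resolution could look roughly like the sketch below. All names here (`ConverterSettings`, `resolveConverters`, the `source` labels) are hypothetical and not taken from the tools; the sketch only illustrates the precedence order (MRGref, then commandline, then config file) and the separate handling of con[error].

```typescript
// Hypothetical settings shape; `converters` holds con[1], con[2], ... in order.
interface ConverterSettings {
  converters?: string[];
  errorConverter?: string; // con[error]
}

interface ResolvedConverters {
  converters: string[];
  errorConverter: string | undefined;
  source: "mrgref" | "commandline" | "config" | "none";
}

function resolveConverters(
  commandLine: ConverterSettings,
  configFile: ConverterSettings,
  mrgRefConverter?: string,
): ResolvedConverters {
  // con[error] is a parameter in its own right: resolved independently of the array.
  const errorConverter = commandLine.errorConverter ?? configFile.errorConverter;

  // A converter written inside the MRGref takes precedence over anything else.
  if (mrgRefConverter !== undefined) {
    return { converters: [mrgRefConverter], errorConverter, source: "mrgref" };
  }
  // Otherwise one source supplies the whole array; arrays are never merged.
  if (commandLine.converters && commandLine.converters.length > 0) {
    return { converters: commandLine.converters, errorConverter, source: "commandline" };
  }
  if (configFile.converters && configFile.converters.length > 0) {
    return { converters: configFile.converters, errorConverter, source: "config" };
  }
  return { converters: [], errorConverter, source: "none" };
}

// Commandline gives one converter, config gives two plus a con[error].
const resolved = resolveConverters(
  { converters: ["{{glossaryText}}"] },
  { converters: ["{{glossaryTerm}}", "{{glossaryText}}"], errorConverter: "{{term}} not found" },
);
console.log(resolved);
```

In this example the config file's second converter is ignored because the commandline is the source of the array, while the con[error] from the config file is still picked up.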

I've drafted some text for this here. Pls check/review

@Ca5e does this help?

RieksJ commented 9 months ago

Like the (first) converter, a second (third, ...) converter for the HRGT will be executed for every MRGEntry that the HRGT processes. This means that we can generate multiple HRG-entries from a single MRGEntry, which is beneficial, e.g., if an MRG entry also specifies abbreviations, which can then get their own HRG-entry.

Ca5e commented 9 months ago

For the HRGT, do we want the n in converter[n] to mean the count from which to start using a specific converter, or the order in which to use converters? For example, depending on the approach we choose:

converter[1]: {{glossaryText}}
converter[4]: {{glossaryTerm}}

will result in:

This is the glossaryText value of the term Test.
Test

or

This is the glossaryText value of the term Test.
Test
Test
Test
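
For reference, the "order" reading (the first result above) can be sketched as below. This is a minimal illustration only: the `Entry` shape and `renderInOrder` are hypothetical names, and plain functions stand in for the two templates. The "count" reading is not sketched here.

```typescript
interface Entry {
  glossaryTerm: string;
  glossaryText: string;
}

type IndexedConverter = (e: Entry) => string;

// Keys are the n in converter[n]; gaps (2 and 3 here) are simply skipped.
const indexed = new Map<number, IndexedConverter>([
  [4, (e) => e.glossaryTerm],
  [1, (e) => e.glossaryText],
]);

// Apply each converter once per entry, in ascending order of n.
function renderInOrder(entry: Entry, converters: Map<number, IndexedConverter>): string[] {
  return Array.from(converters.entries())
    .sort((a, b) => a[0] - b[0])
    .map((pair) => pair[1](entry));
}

const lines = renderInOrder(
  { glossaryTerm: "Test", glossaryText: "This is the glossaryText value of the term Test." },
  indexed,
);
console.log(lines.join("\n"));
```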

RieksJ commented 9 months ago

RieksJ commented 8 months ago

Done.