tno-terminology-design / tev2-tools

The Terminology Engine (v2) is a set of specifications and tools that caters for the creation and maintenance (i.e. curation) of terminologies. This repository contains the sources for the tools.
Apache License 2.0

Make formphrase macros importable/configurable #27

Open RieksJ opened 8 months ago

RieksJ commented 8 months ago

To make formphrase macros usable when terminologies are developed in different languages, it must be possible to specify them outside the source code of the tools. A curator who wants to adjust the macros can then do so as well. It is also handy for testing new regex candidates.

This issue calls for:

As a starting point for the specifications, I think the macros should either be specified in a (new) section of the SAF (one that doesn't get copied into MRGs), or be made a command-line option for the MRGT (so that they can also be listed in the MRGT configuration file).

RieksJ commented 7 months ago

Decisions:

  1. We let go of the idea that all stuff that we can put in the config file of a tool must also be available on the command-line.
  2. The (part of the) config file for MRGT will have a section that allows for specifying (possibly empty) formphrase macros. If a (possibly empty) formphrase macro is specified, it will override the predefined macros, so you can 'adjust', and even 'remove' predefined macros, as well as add your own.
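
To make decision 2 concrete, here is a minimal Python sketch of the override behaviour. It is not the MRGT implementation: the macro names, their alternatives, and the helpers `merge_macros` and `expand` are hypothetical, and it assumes that an empty macro in the config means "remove the predefined macro".

```python
# Hypothetical predefined macros; the real set lives in the MRGT source code.
PREDEFINED = {
    "{ss}": ["", "s"],
    "{yies}": ["y", "ies"],
}


def merge_macros(predefined, configured):
    """Let configured macros override predefined ones.

    Assumption: a macro configured with an empty list is removed.
    """
    merged = dict(predefined)
    for name, alternatives in configured.items():
        if alternatives:
            merged[name] = list(alternatives)  # adjust or add a macro
        else:
            merged.pop(name, None)  # empty macro: drop the predefined one
    return merged


def expand(formphrase, macros):
    """Return every phrase obtained by substituting the macros in a formphrase.

    All occurrences of one macro in a phrase get the same alternative.
    """
    results = [formphrase]
    for name, alternatives in macros.items():
        expanded = []
        for phrase in results:
            if name in phrase:
                expanded.extend(phrase.replace(name, alt) for alt in alternatives)
            else:
                expanded.append(phrase)
        results = expanded
    return results


# Example config section: adjust "{ss}", remove "{yies}", add "{en}".
configured = {"{ss}": ["", "s", "'s"], "{yies}": [], "{en}": ["", "en"]}
merged = merge_macros(PREDEFINED, configured)
print(expand("actor{ss}", merged))
```

Keeping the merge separate from the expansion means a curator's config can be checked against the predefined set before any formphrase is touched.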

RieksJ commented 7 months ago

@Ca5e:

Ca5e commented 7 months ago

@RieksJ, please check the documentation.

RieksJ commented 7 months ago

RieksJ commented 6 months ago

@Ca5e Can you have a look at the specification of form phrase macro maps, and particularly the section on how they work?

If you are convinced the specifications and the operation of the tools agree, you may close this issue. If not, please comment on what the (remaining) issues are.

Ca5e commented 6 months ago

Some things I believe should be looked into...

I'd say form phrases aren't used to refer to a semantic unit, but instead enable a semantic unit to be referred to.

Here is how a form phrase is matched against:

Considering we're using termid to match where possible, I believe this section should be rethought. Within the MRGT there isn't much of a difference between searching in curated text or MRGs either. When the tool first recognizes that the curated texts are supposed to be used, it loads all of the curated texts as a 'normal' list of MRG entries.
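
A rough Python sketch of the lookup order described here, assuming curated texts and MRGs both end up as the same kind of entry list. The field names `termid` and `formPhrases` and the helper `find_entry` are illustrative assumptions, not the MRGT's actual code.

```python
def find_entry(entries, reference):
    """Look up an entry by termid first, then fall back to its form phrases.

    `entries` is a plain list of dicts, whether it was loaded from an MRG
    or built from curated texts (assumption: both share this shape).
    """
    for entry in entries:
        if entry.get("termid") == reference:
            return entry
    for entry in entries:
        if reference in entry.get("formPhrases", []):
            return entry
    return None


# Hypothetical entries, as they might look after loading.
entries = [
    {"termid": "concept:actor", "formPhrases": ["actor", "actors"]},
    {"termid": "concept:party", "formPhrases": ["party", "parties"]},
]
print(find_entry(entries, "parties"))
```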

RieksJ commented 5 months ago

@Ca5e Thanks for all the comments, which I have used to improve the documentation.

And yes, please move back so we can combine these sources.