metanorma / metanorma-standoc

Metanorma for Standoc documents
BSD 2-Clause "Simplified" License

Intelligently matching `term:[phrase]` to other phrase forms #301

Open ronaldtse opened 4 years ago

ronaldtse commented 4 years ago

For example, see https://github.com/metanorma/stepmod-utils/issues/23.

There, the directive `term:[individual products]` is given, but the defined term is "individual product".

By using word stemmers we could potentially match the words used in the definitions to the defined terms of the document.

For example, ruby-stemmer uses Snowball, which converts both the singular "individual product" and the plural "individual products" into "individu product".
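A minimal sketch of the idea: reduce each phrase to a comparison key by stemming every word, then compare keys. To stay self-contained it uses a naive plural-stripping lambda as a stand-in for Snowball, so its keys look like "individual product" rather than "individu product". `NAIVE_STEM` and `stem_key` are hypothetical names, not part of metanorma-standoc or ruby-stemmer.

```ruby
# Stand-in for a real stemmer (an assumption, for illustration only):
# strips a few common English plural endings. A Snowball stemmer is more
# aggressive, e.g. it reduces "individual" to "individu".
NAIVE_STEM = lambda do |word|
  case word
  when /\A(.+)ies\z/ then "#{$1}y"
  when /\A(.+(?:s|x|z|ch|sh))es\z/ then $1
  when /\A(.+[^s])s\z/ then $1
  else word
  end
end

# Hypothetical helper: lowercase the phrase, split it on whitespace,
# stem each word, and join the stems back into one key.
def stem_key(phrase, stemmer = NAIVE_STEM)
  phrase.downcase.split.map { |w| stemmer.call(w) }.join(" ")
end

# Both forms reduce to the same key, so they can be matched to each other.
puts stem_key("individual products")
puts stem_key("Individual product")
```

Because the stemmer is passed in as a callable, a real stemmer could be swapped in without changing `stem_key`.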

There is certainly a problem if the dictionary contains very short terms such as "to".

There are two ways we could do this:

  1. Automatically, by checking all matches in the definition text (e.g. the definition contains "individual products", and we match it to "individual product").
  2. When the user specifies it explicitly (e.g. `term:[individual products]` instead of `term:[individual products, individual product]`).
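
The two approaches could be sketched roughly as follows. This is an illustration under stated assumptions, not the metanorma-standoc implementation: all names here (`stem_key`, `build_index`, `resolve`, `find_terms`, `MIN_KEY_LENGTH`) are hypothetical, the stemming is a crude plural strip standing in for Snowball, and the length cutoff is just one possible guard against very short terms such as "to".

```ruby
# Hypothetical cutoff: keys shorter than this are never matched automatically,
# so a defined term like "to" does not match every "to" in running text.
MIN_KEY_LENGTH = 4

# Crude stand-in for a real stemmer (assumption): lowercase the phrase and
# drop a trailing "s" from each word unless the word ends in "ss".
def stem_key(phrase)
  phrase.downcase.scan(/[[:alnum:]-]+/)
        .map { |w| w.sub(/(?<=[^s])s\z/, "") }
        .join(" ")
end

# Map each stemmed key to the term as it is defined in the document.
def build_index(terms)
  terms.to_h { |t| [stem_key(t), t] }
end

# Explicit case: resolve what the user wrote in term:[...] to a defined term.
# Returns nil when nothing matches.
def resolve(reference, index)
  index[stem_key(reference)]
end

# Automatic case: slide over the definition text in word n-grams and collect
# every defined term whose stemmed key equals the n-gram's stemmed key.
def find_terms(text, index)
  words = text.scan(/[[:alnum:]-]+/)
  max_len = index.keys.map { |k| k.split.size }.max || 0
  found = []
  (1..max_len).each do |n|
    words.each_cons(n) do |gram|
      key = stem_key(gram.join(" "))
      next if key.length < MIN_KEY_LENGTH
      term = index[key]
      found << term if term && !found.include?(term)
    end
  end
  found
end

index = build_index(["individual product", "to"])
p resolve("individual products", index)
# => "individual product"
p find_terms("Two individual products go to the same assembly.", index)
# => ["individual product"]
```

In this sketch the explicit form ignores the length cutoff, since the user has stated the intent directly; only the automatic scan skips short keys.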

ronaldtse commented 4 years ago

https://github.com/yohasebe/lemmatizer/ may also be an option, but it requires a dictionary.