mapping-commons / semantic-mapping-vocabulary

https://mapping-commons.github.io/semantic-mapping-vocabulary/

need to clarify some terms #13

Open saubin78 opened 1 year ago

saubin78 commented 1 year ago

Hi. I feel there are some sources of confusion in the definitions for the semapv elements:

matentzn commented 1 year ago

matching: in semapv, sounds like the activity/process. Should it be performed by machines only (as it looks like today), or also by humans?

mapping: in semapv, sounds like the result

Yes both are correct, see details of discussion here: https://github.com/mapping-commons/sssom/discussions/169

Note that this is not an attempt to be normative. There is no denying that many people use matching only for machines, and others for both humans and machines. Many use mapping as a process. This is just to define how the terms are used in the context of SEMAPV (and by extension, SSSOM).

curation:

  • in semapv, ManualMappingCuration is defined as "An matching process that is performed by a human agent and is based on human judgement and domain knowledge." With this definition, the element should rather be named "ManualMatching" (as it is a process)
  • otherwise, according to Merriam-Webster, curation is "the act or process of selecting and organizing (something, such as articles or images) for distribution or publication", which means that the mappings already exist, i.e. have been computed. This would then be close to the semapv definition for review

This is a great point. I would argue that the "selecting" part can be seen as "selecting an appropriate identifier"; we have been using mapping curation primarily for the task of "finding an appropriate term to map to". Unfortunately, we can't really change the property at this stage, as it is too widely used and the churn of having everyone update their SSSOM files would be enormous. If we had heard your comment earlier in the process, we would have considered it. Thanks in any case for making that point!

review: A process that is concerned with determining if a mapping “candidate” (otherwise determined) is reasonable/correct. This should be applicable to mappings created by either a human or a machine

Could you please clarify or harmonize the lexicon used?

Not sure I understand, but could you perhaps make concrete suggestions for how we could clarify the definitions to disambiguate them better?

saubin78 commented 1 year ago

Taking a look at the specifications' definitions:

matentzn commented 1 year ago

Yes, I agree @saubin78, thanks for this analysis.

Which of the two do you prefer? I personally feel more inclined to having "manual mapping curation" as a subclass of "matching process". This is more practical IMO. With improving automated methods, the distinction between the two will steadily shrink. In the end, the human brain is just a big "pattern matching system". What do you think? cc @graybeal