need to clarify some terms

saubin78 commented 1 year ago

Hi. I feel there are some sources of confusion in the definitions for the semapv elements:

matching: in semapv, sounds like the activity/process. Should it be by machines only (as it looks like today)? or also by humans?
mapping: in semapv, sounds like the result
curation:
- in semapv, ManualMappingCuration is defined as "An matching process that is performed by a human agent and is based on human judgement and domain knowledge." With this definition, the element should rather be named "ManualMatching" (as it is a process)
- otherwise, according to Merriam-Webster, curation is "the act or process of selecting and organizing (something, such as articles or images) for distribution or publication", so this means that mappings already exist, i.e. have been computed. This would then be close to the semapv definition for review
review: "A process that is concerned with determining if a mapping “candidate” (otherwise determined) is reasonable/correct." This should be applicable to mappings created by either a human or a machine.

Could you please clarify or harmonize the lexicon used?

matentzn commented 1 year ago

> matching: in semapv, sounds like the activity/process. Should it be by machines only (as it looks like today)? or also by humans?
> mapping: in semapv, sounds like the result

Yes both are correct, see details of discussion here: https://github.com/mapping-commons/sssom/discussions/169

Note that this is not an attempt to be normative. There is no denying that many people use matching only for machines, and others for both humans and machines. Many use mapping as a process. This is just to define how the terms are used in the context of SEMAPV (and by extension, SSSOM).

> curation:
> - in semapv, ManualMappingCuration is defined as "An matching process that is performed by a human agent and is based on human judgement and domain knowledge." With this definition, the element should rather be named "ManualMatching" (as it is a process)
> - otherwise, according to Merriam-Webster, curation is "the act or process of selecting and organizing (something, such as articles or images) for distribution or publication", so this means that mappings already exist, i.e. have been computed. This would then be close to the semapv definition for review

This is a great point. I would argue that the "selecting" part can be seen as "selecting an appropriate identifier"; we have been using mapping curation primarily for the task of "finding an appropriate term to map to". Unfortunately, we can't really change the property at this stage, as it is too widely used and the churn of having everyone update their SSSOM files would be enormous. If we had heard your comment earlier in the process, we would have considered it. Thanks in any case for making that point!

> review: "A process that is concerned with determining if a mapping “candidate” (otherwise determined) is reasonable/correct." This should be applicable to mappings created by either a human or a machine
>
> Could you please clarify or harmonize the lexicon used?

Not sure I understand, but could you perhaps make concrete suggestions for how we could clarify the definitions to disambiguate better?

saubin78 commented 1 year ago

Taking a look at the specifications' definitions:

manual mapping curation: "An matching process that is performed by a human agent and is based on human judgement and domain knowledge." --> this one is clear enough
mapping review: "A process that is concerned with determining if a mapping “candidate” (otherwise determined) is reasonable/correct." --> this one is clear (now)
matching process: "An process that results in a mapping between a subject and an object entity." --> either this one should explicitly mention that the matching is performed by a machine, OR the definition of "matching process" stays as it is and "manual mapping curation" becomes a subclass of "matching process".
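
To make the second option concrete, here is a minimal sketch in Python. It only illustrates the proposed hierarchy: the class names and the helper `is_matching_process` are hypothetical stand-ins for the semapv terms, and semapv itself is not defined as Python classes.

```python
# Illustrative only: hypothetical Python stand-ins for the semapv terms,
# not the actual encoding of the vocabulary.

class MatchingProcess:
    """A process that results in a mapping between a subject and an object entity.

    The definition deliberately says nothing about who performs the process
    (human or machine).
    """


class ManualMappingCuration(MatchingProcess):
    """A matching process performed by a human agent, based on human
    judgement and domain knowledge (modelled here as a subclass)."""


def is_matching_process(term: type) -> bool:
    """Return True if the given term is a matching process or a subclass of it."""
    return issubclass(term, MatchingProcess)


if __name__ == "__main__":
    # Under this option, manual curation counts as a kind of matching process.
    print(is_matching_process(ManualMappingCuration))
```

With this shape, the generic definition of "matching process" does not need to mention machines at all; the human case is just a more specific kind of matching.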

matentzn commented 1 year ago

Yes, I agree @saubin78, thanks for this analysis.

Which of the two do you prefer? I personally feel more inclined to have "manual mapping curation" as a subclass of "matching process". This is more practical IMO. As automated methods improve, the distinction between the two will keep shrinking. In the end, the human brain is just a big "pattern matching system". What do you think? cc @graybeal
