Consider how to effectively reference subgraphs/subsets

subject_source (or, of course, object_source) references the source in general, like http://purl.obolibrary.org/obo/hp.owl. subject_source_version references a particular version of the resource, like http://purl.obolibrary.org/obo/hp/releases/2020-03-27/hp-base.owl.

We often want to say that a mapping set maps, say, all phenotypes in subject_source to all diseases in object_source, so we need to be able to express something like subject_subgraph. My feeling is that we should overload subject_source_version for this purpose and require that the supplied link point to the actual subgraph of the source that was used to compute the mapping.

If we wanted to denote what the Medical Informatics community calls "value sets" ("specifies a set of codes drawn from one or more code systems, intended for use in a particular context"), we could define a new field called source_terms (terms being the neutral version of what the MI people call a value set and the ontology world calls a signature of interest). The reason I am not convinced this is a good idea is that the mappings themselves already serve as a proxy for the value set: if you mention a term A in the subject_id column, it was obviously also part of the value set that was being mapped.
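To make the "mappings as a proxy for the value set" point concrete, here is a minimal sketch that recovers the implied term sets from a mapping table. The rows and the EX: identifiers are hypothetical placeholders, and the helper name implied_terms is my own, not part of any existing tooling.

```python
import csv
import io

# Hypothetical mapping rows; the EX: identifiers are placeholders, not real mappings.
MAPPINGS_TSV = """subject_id\tpredicate_id\tobject_id
EX:phenotype1\tskos:exactMatch\tEX:disease1
EX:phenotype2\tskos:closeMatch\tEX:disease2
EX:phenotype1\tskos:closeMatch\tEX:disease3
"""


def implied_terms(tsv_text, column):
    """Return the distinct identifiers found in one column of a mapping table."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row[column] for row in reader}


# The terms that appear in each column are the "value set" implied by the mappings.
subject_terms = implied_terms(MAPPINGS_TSV, "subject_id")
object_terms = implied_terms(MAPPINGS_TSV, "object_id")

print(sorted(subject_terms))
print(sorted(object_terms))
```

Note that this only recovers terms that ended up in at least one mapping. Terms that were part of the input but produced no mapping cannot be reconstructed this way, which is one case where an explicit reference to the input subgraph would carry extra information.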

Maybe all that's needed is to overload subject_source_version to refer to the exact and complete input graph that was used to perform the mapping? Opinions welcome.
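As a rough illustration of the overloading idea, the mapping set metadata could look something like the sketch below. The subject URLs are the ones from the examples above; the object URLs use example.org and are purely hypothetical, as is the MappingSetSources class itself, which is only a stand-in for however the metadata ends up being serialised.

```python
from dataclasses import asdict, dataclass


@dataclass
class MappingSetSources:
    """Illustrative container for the four source fields discussed here."""

    subject_source: str
    subject_source_version: str
    object_source: str
    object_source_version: str


meta = MappingSetSources(
    # The source in general.
    subject_source="http://purl.obolibrary.org/obo/hp.owl",
    # Overloaded: points at the exact input graph that was mapped.
    subject_source_version="http://purl.obolibrary.org/obo/hp/releases/2020-03-27/hp-base.owl",
    # Hypothetical object-side URLs, for illustration only.
    object_source="http://example.org/disease.owl",
    object_source_version="http://example.org/disease/releases/2020-03-27/disease-subset.owl",
)

for field_name, value in asdict(meta).items():
    print(f"{field_name}: {value}")
```

Under this reading no new field is needed: the version link alone identifies both the release and the subgraph, at the cost of the field name no longer describing everything it carries.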
