Open mbrush opened 1 year ago
I think you captured this very well. The attribute_type_id conveys to the reader that the value will be a primary_knowledge_source (which is_a knowledge_source). So all the reader then needs to know is how you will convey the identity of this knowledge source in the value field: via a URI? via a CURIE? via a proper name? All three are strings, but the value types of those strings differ, and which form is used matters a lot to the reader when interpreting the string. I'd contend that the current value_type_id = biolink:InformationResource isn't useful because the reader already knew that from the attribute_type_id. What it needs to learn is how to interpret the string value. To me, that means URI or CURIE or proper name in this case. That is what I had in mind with value_type_id.
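To make the point about interpretation concrete, here is a minimal Python sketch of a reader that dispatches on the technical type of the string. It is only an illustration under stated assumptions: the type labels "uri", "curie", and "name", the `interpret` function, and the `PREFIXES` map (including the expansion URL for `infores`) are hypothetical and are not defined by TRAPI or Biolink.

```python
# Hypothetical prefix map; a real client would load this from a prefix
# registry or JSON-LD context. The URL below is an assumption.
PREFIXES = {"infores": "https://w3id.org/biolink/infores/"}


def interpret(value: str, value_type: str) -> str:
    """Return a resolvable identifier (or a labelled name) for a string value.

    `value_type` uses illustrative labels ("uri", "curie", "name"),
    not an agreed vocabulary.
    """
    if value_type == "uri":
        # Already a full identifier; use as-is.
        return value
    if value_type == "curie":
        # Split "prefix:local_id" and expand the prefix.
        prefix, sep, local_id = value.partition(":")
        if not sep or prefix not in PREFIXES:
            raise ValueError(f"cannot expand CURIE: {value!r}")
        return PREFIXES[prefix] + local_id
    if value_type == "name":
        # A proper name cannot be resolved without a lookup service.
        return f"(unresolved name) {value}"
    raise ValueError(f"unknown value type: {value_type!r}")


print(interpret("infores:ctd", "curie"))
```

The same string handling would not be possible if the reader only knew that the value is an InformationResource, which is the argument made above.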
Noting that biolink already has a controlled vocabulary of types that it inherits from LinkML: https://github.com/linkml/linkml-model/blob/main/linkml_model/model/schema/types.yaml
Using "CURIE" for this as the type makes sense to me. Do we need anyone else to weigh in on this, or shall we declare it to be the type of thing in the value field?
The TRAPI Attribute object includes a `value_type_id` field to indicate the "type" of the thing reported in the `Attribute.value` field.

We need to decide whether this field should capture the more foundational / technical data type of what is in the `value` field (e.g. CURIE, string, float, ...), or a more semantic/ontological type of thing the value concept represents (e.g. "InformationResource", "Publication", "Person", "p-value", ...).

The example that spurred this question on the 10-13-22 Data Modeling call concerns a `value_type_id` of "biolink:InformationResource".

Many felt it would be more useful to capture a more foundational "type" here (e.g. "CURIE", since the value here is represented as a CURIE), especially since the semantic/ontological type of the value will usually be knowable from the range of the Biolink edge property in the `attribute_type_id` field, or from the name of the edge property itself (e.g. `biolink:p-value`).

While this concerns elements of the TRAPI schema, this is a broader issue concerning modeling conventions, and Biolink support may ultimately be needed to implement our decision (e.g. an enumeration of foundational data types to constrain this field). Tagging @edeutsch and @sierra-moxon for their input.
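As a rough illustration of the two options, the Python sketch below builds the same attribute twice, once with a semantic `value_type_id` and once with a technical one. This is not the example from the call: the value `infores:ctd` is a placeholder, `linkml:Curie` is a hypothetical identifier for the technical type (no such enumeration has been agreed), and `differing_fields` is a helper invented for this sketch.

```python
# Option A: semantic/ontological type. It repeats what the reader can
# already infer from the range of attribute_type_id.
semantic_typed = {
    "attribute_type_id": "biolink:primary_knowledge_source",
    "value": "infores:ctd",  # placeholder value for illustration
    "value_type_id": "biolink:InformationResource",
}

# Option B: foundational/technical type. It tells the reader how to parse
# the string. "linkml:Curie" is a hypothetical identifier, not an agreed term.
technical_typed = {**semantic_typed, "value_type_id": "linkml:Curie"}


def differing_fields(a: dict, b: dict) -> list[str]:
    """List the keys whose values differ between two attribute dicts."""
    return sorted(k for k in a if a[k] != b.get(k))


print(differing_fields(semantic_typed, technical_typed))
```

Only `value_type_id` changes between the two forms; the attribute type and the value itself are the same under either convention.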