Open turbomam opened 8 months ago
Thanks for bringing this to my attention, @turbomam.
Was there a specific PR for this change (the change to the type
slot)? I think knowing its URL will help us (a) name the Migrator
module and (b) get more info if we need it.
Is creating a CURIE something that involves obtaining something from the outside world (e.g. "minting" or "registering" something), or is it just a matter of combining various pieces of information that are already available in the same Mongo document? If it's the former (i.e. involves accessing the outside world), I may add something to the "adapter" class I'm currently working on, to support it.
Notes:
Good questions.
`type` can be seen in that PR's `src/schema/basic_slots.yaml`
`type` slot
`type` values, in CURIE format, to all data instances in the example data files
`type` value for a data instance. The `type` value should just reiterate the `class_uri` of the class that has been instantiated. Maybe you can just determine that in Python code by checking the types of each instance, including nested ones. But maybe you will need to open a SchemaView of the schema. Then you can look up ClassDefinitions by name and extract their `class_uri`s.

```python
from linkml_runtime import SchemaView

# Load the schema so class definitions can be looked up by name.
schema_yaml_file = "../src/schema/nmdc.yaml"
schema_view = SchemaView(schema_yaml_file)

# Fetch the class definition and print the class_uri it asserts.
class_obj = schema_view.get_class("CreditAssociation")
print(class_obj.class_uri)  # prov:Association
```
_That one is unusual in the sense that the `class_uri` isn't equivalent to the default prefix (`nmdc`) followed by a colon and the class name. It's important to check._
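To illustrate the check described above, here is a minimal sketch. It assumes that a class without an explicit `class_uri` falls back to the default prefix plus the class name. The helper name `resolve_type_curie` and its parameters are hypothetical, not part of nmdc-schema or linkml_runtime.

```python
from typing import Optional


def resolve_type_curie(
    class_name: str,
    class_uri: Optional[str],
    default_prefix: str = "nmdc",
) -> str:
    """Return the CURIE to store in an instance's `type` slot (hypothetical helper)."""
    # Prefer the class_uri asserted in the schema, e.g. prov:Association.
    if class_uri:
        return class_uri
    # Assumed fallback: default prefix, a colon, then the class name.
    return f"{default_prefix}:{class_name}"


print(resolve_type_curie("CreditAssociation", "prov:Association"))
print(resolve_type_curie("Study", None))
```

The point of the sketch is that the `class_uri` is consulted first, so classes like CreditAssociation are not silently given an `nmdc:` value.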
I can help with this as much or little as you want.
`type` slot in nmdc-schema has string range
`type` slot in berkeley-schema-fy24 has to be a `class_uri`
`class_uri`
`type` slot (without reasserting that capability if they inherited that capability)
`type` value
See: https://github.com/microbiomedata/berkeley-schema-fy24/tree/main/tests
test_all_classes_assert_a_class_uri.py
test_all_classes_can_use_type_slot.py
test_inherited_slots_not_reiterated.py
Check whether https://github.com/microbiomedata/nmdc-schema/issues/1607 encapsulates this issue. See migrator `nmdc_schema/migrators/migrator_from_X_to_PR10.py`.
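For discussion, here is a rough sketch of what the core of such a migration step might look like. It is not the code in the migrator named above. The lookup tables, the function name `add_type`, and the sample document are all hypothetical; in practice the class-to-CURIE mapping and the nested slot ranges would presumably come from the schema rather than being hard-coded.

```python
from typing import Any, Dict, Tuple

# Hypothetical lookup: class name -> class_uri CURIE (assumed values).
TYPE_BY_CLASS: Dict[str, str] = {
    "Study": "nmdc:Study",
    "CreditAssociation": "prov:Association",
}

# Hypothetical lookup: (class name, slot name) -> class of the nested instances.
NESTED_RANGES: Dict[Tuple[str, str], str] = {
    ("Study", "has_credit_associations"): "CreditAssociation",
}


def add_type(document: Dict[str, Any], class_name: str) -> Dict[str, Any]:
    """Return a copy of `document` with `type` set on it and on known nested instances."""
    migrated = dict(document)
    migrated["type"] = TYPE_BY_CLASS[class_name]
    for slot_name, value in document.items():
        nested_class = NESTED_RANGES.get((class_name, slot_name))
        if nested_class is None:
            continue  # Slot does not hold nested class instances (as far as we know).
        if isinstance(value, list):
            migrated[slot_name] = [
                add_type(item, nested_class) if isinstance(item, dict) else item
                for item in value
            ]
        elif isinstance(value, dict):
            migrated[slot_name] = add_type(value, nested_class)
    return migrated


# Illustrative document only; not taken from the example data files.
study = {
    "id": "example:study-1",
    "has_credit_associations": [{"applied_roles": ["Investigation"]}],
}
migrated_study = add_type(study, "Study")
print(migrated_study)
```

The function returns a new dictionary instead of mutating its input, which makes it easier to compare documents before and after the step.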
@turbomam can this be closed? I believe that in berkeley, all classes now require `type`, with a CURIE as its range.
The berkeley-schema-fy24 is much stricter about the `type` slot, compared to the current nmdc-schema.
Currently the type of a WorkflowExecutionActivity is
The berkeley-schema-fy24 requires that all data instances of nmdc-schema classes reiterate the class's `class_uri` as a CURIE.
Theoretically there should be a good example of this pattern (for some WorkflowExecution) in the example data files directory already, but I haven't found one yet.
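As a rough illustration of the stricter requirement, the sketch below checks one document's `type` value against the CURIE expected for its class. The function name `find_type_problems`, the regular expression, and the sample values are assumptions made for this example. The pattern is deliberately loose and only checks for a `prefix:local` shape.

```python
import re
from typing import Any, Dict, List

# Loose, assumed pattern: a prefix, a colon, then a non-empty local part.
CURIE_PATTERN = re.compile(r"^[A-Za-z_][\w.-]*:\S+$")


def find_type_problems(document: Dict[str, Any], expected_curie: str) -> List[str]:
    """List the ways a document's `type` value falls short of the expected CURIE."""
    problems: List[str] = []
    value = document.get("type")
    if value is None:
        problems.append("missing type")
    elif not isinstance(value, str) or not CURIE_PATTERN.match(value):
        problems.append(f"type is not a CURIE: {value!r}")
    elif value != expected_curie:
        problems.append(
            f"type {value!r} does not match class_uri {expected_curie!r}"
        )
    return problems


print(find_type_problems({"type": "prov:Association"}, "prov:Association"))
print(find_type_problems({"id": "example:1"}, "prov:Association"))
```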
cc @Michal-Babins @mslarae13