Is your feature request related to a problem? Please describe.
In jsonschemagen (which is used by linkml-validate) there is an option, include_range_class_descendants, that enables the generation of oneOf for all range class descendants. With it enabled, if a slot has another class as its range, subclasses of the range class are also accepted as valid, which is something I guess one would always assume to be the case (Liskov substitution principle).
There are some problems with the current feature, however:
It's disabled by default. You can enable it in gen-jsonschema but not in linkml-validate, which uses jsonschemagen internally. So linkml-validate won't validate your JSON if you use a subclass of the range for an inlined slot.
It uses oneOf. Suppose you have two subclasses of a range class that have the same slots (which happens, for instance, in a taxonomy). The JSON will not validate, because oneOf requires that exactly one of the conditions is true and fails if both are true. (I changed this for my poly work in a PR that I want to split up: https://github.com/linkml/linkml/pull/1257/commits/6c074d7ace2df865cb10eaf37f9bdb4c13cb83f8)
I guess the reason it is not the default is that, when using linkml to build large taxonomies with lots of is_a descendants that have no extra slots, you end up with a huge list of anyOf or oneOf entries with basically identical conditions. This is not a problem for me, but I guess it is why this was not made the default. The ideal solution would be for json-schema-gen to auto-simplify the oneOf list and remove redundant entries from it.
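To make the last two points concrete, here is a minimal sketch in plain Python. It is not how jsonschemagen works: `matches` is a toy matcher that only understands `required`, `properties` and `additionalProperties: false`, and `dedupe_branches` is a hypothetical helper that treats `title` and `description` as the only keys that don't affect validation. The Animal/Dog/Cat classes are made up for illustration.

```python
import json

# Assumption for this sketch: these keys are annotations only and never affect validation.
ANNOTATION_KEYS = {"title", "description"}


def matches(instance: dict, schema: dict) -> bool:
    """Toy matcher covering only `required` and `additionalProperties: false`."""
    if any(key not in instance for key in schema.get("required", [])):
        return False
    if schema.get("additionalProperties") is False:
        allowed = set(schema.get("properties", {}))
        if any(key not in allowed for key in instance):
            return False
    return True


def one_of(instance: dict, branches: list) -> bool:
    # oneOf: exactly one branch may match.
    return sum(matches(instance, b) for b in branches) == 1


def any_of(instance: dict, branches: list) -> bool:
    # anyOf: at least one branch must match.
    return any(matches(instance, b) for b in branches)


def dedupe_branches(branches: list) -> list:
    """Hypothetical simplification: keep one branch per distinct validation shape."""
    seen = set()
    unique = []
    for branch in branches:
        stripped = {k: v for k, v in branch.items() if k not in ANNOTATION_KEYS}
        fingerprint = json.dumps(stripped, sort_keys=True)
        if fingerprint not in seen:
            seen.add(fingerprint)
            unique.append(branch)
    return unique


# A range class and two is_a descendants that add no slots of their own.
animal = {
    "title": "Animal",
    "properties": {"name": {"type": "string"}},
    "required": ["name"],
    "additionalProperties": False,
}
dog = {**animal, "title": "Dog"}
cat = {**animal, "title": "Cat"}
branches = [animal, dog, cat]
instance = {"name": "Rex"}

print(one_of(instance, branches))                   # False: all three branches match
print(any_of(instance, branches))                   # True
print(len(dedupe_branches(branches)))               # 1
print(one_of(instance, dedupe_branches(branches)))  # True
```

A real simplification would have to compare the resolved definitions behind each `$ref` instead of inline dicts, and decide which class name to keep, so treat this only as an illustration of the idea.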
So this issue is to track progress here and see what needs to be done.
I agree that being enabled by default is better. The reason this wasn't the case was conservatism: we couldn't be sure people were not relying on the current behavior. But I think it is better to switch the default and make it easy for people to temporarily disable it while they fix any underlying issues.
Good analysis. It looks like we are now using anyOf?
I am not sure this is a concern; no one should have ontology-style 1000-class taxonomies. But it would be good to check that there are no unforeseen performance issues when using a fairly deep taxonomy like biolink.