biolink / biolinkml

DEPRECATED: replaced by linkml
https://github.com/linkml/linkml
Creative Commons Zero v1.0 Universal

Slot declarations are ignored under some circumstances #113

Closed cmungall closed 4 years ago

cmungall commented 4 years ago

Test case:

id: https://microbiomedata/schema

prefixes:
  biolinkml: https://w3id.org/biolink/biolinkml/

imports:
  - biolinkml:types

classes:

  named thing: {}

  test class:
    slots:
      - test attribute 1
      - test attribute 2

slots:

  attribute:
    domain: named thing

  test attribute 1:
    is_a: attribute

  test attribute 2: {}

We expect the class "test class" to have the two declared slots.

However, when I run pipenv run gen-py-classes schema/test.yaml

I get a single attribute:

...
class NamedThing(YAMLRoot):
    _inherited_slots: ClassVar[List[str]] = []

    class_class_uri: ClassVar[URIRef] = URIRef("https://microbiomedata/schema/NamedThing")
    class_class_curie: ClassVar[str] = None
    class_name: ClassVar[str] = "named thing"
    class_model_uri: ClassVar[URIRef] = URIRef("https://microbiomedata/schema/NamedThing")

@dataclass
class TestClass(YAMLRoot):
    _inherited_slots: ClassVar[List[str]] = []

    class_class_uri: ClassVar[URIRef] = URIRef("https://microbiomedata/schema/TestClass")
    class_class_curie: ClassVar[str] = None
    class_name: ClassVar[str] = "test class"
    class_model_uri: ClassVar[URIRef] = URIRef("https://microbiomedata/schema/TestClass")

    test_attribute_2: Optional[str] = None
...

It is missing test attribute 1, the attribute that inherits from another attribute (attribute) that has a domain constraint.

@hsolbrig I am worried this is the ghost of the induced domain issue

The json-schema is even odder:

{
   "$id": "https://microbiomedata/schema",
   "$schema": "http://json-schema.org/draft-04/schema#",
   "definitions": {
      "attribute": {
         "type": "string"
      },
      "test_attribute_1": {
         "type": "string"
      },
      "test_attribute_2": {
         "type": "string"
      }
   },
   "properties": {
      "NamedThing": {
         "description": "",
         "properties": {
            "attribute": {
               "type": "string"
            },
            "test_attribute_1": {
               "type": "string"
            }
         },
         "title": "NamedThing",
         "type": "object"
      },
      "TestClass": {
         "description": "",
         "properties": {
            "test_attribute_1": {
               "type": "string"
            },
            "test_attribute_2": {
               "type": "string"
            }
         },
         "title": "TestClass",
         "type": "object"
      }
   },
   "title": "schema",
   "type": "object"
}
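
To make the "induced domain" suspicion concrete, here is a minimal sketch of how a slot could end up attached to a class through an inherited domain rather than through the class's own slots list. It is an illustration only, not biolinkml's actual resolution code: SCHEMA is a hand-written dict mirror of the test case, and effective_domain, declared_slots and domain_induced_slots are hypothetical helper names.

```python
from typing import Dict, List, Optional

# Hypothetical in-memory mirror of the test schema (not biolinkml's own model).
SCHEMA: Dict = {
    "classes": {
        "named thing": {},
        "test class": {"slots": ["test attribute 1", "test attribute 2"]},
    },
    "slots": {
        "attribute": {"domain": "named thing"},
        "test attribute 1": {"is_a": "attribute"},
        "test attribute 2": {},
    },
}


def effective_domain(schema: Dict, slot_name: str) -> Optional[str]:
    """Walk the is_a chain until a slot with an explicit domain is found."""
    seen = set()
    current: Optional[str] = slot_name
    while current is not None and current not in seen:
        seen.add(current)  # guard against accidental is_a cycles
        slot = schema["slots"][current]
        if "domain" in slot:
            return slot["domain"]
        current = slot.get("is_a")
    return None


def declared_slots(schema: Dict, class_name: str) -> List[str]:
    """Slots listed explicitly under the class: what the issue expects to see."""
    return list(schema["classes"][class_name].get("slots", []))


def domain_induced_slots(schema: Dict, class_name: str) -> List[str]:
    """Slots whose own or inherited domain names the class."""
    return [
        name for name in schema["slots"]
        if effective_domain(schema, name) == class_name
    ]


if __name__ == "__main__":
    print(declared_slots(SCHEMA, "test class"))
    print(domain_induced_slots(SCHEMA, "named thing"))
```

Under these assumptions, "test attribute 1" is declared on "test class" but is also induced onto "named thing" via the domain it inherits from "attribute", which matches the NamedThing properties in the JSON Schema output above.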

cmungall commented 4 years ago

@deepakunni3 @wdduncan and I isolated this to pythongen in:

        # Root keys and identifiers go first.  Note that even if a key or identifier is overridden it still
        # appears at the top of the list, as we need to keep the position
        slot_variables = self._slot_iter(cls, lambda slot: (slot.identifier or slot.key) and not slot.ifabsent,
                                         first_hit_only=True)

test attribute 1 lacks identifier and key. It's not clear where these should be set, or why test attribute 2 has them and test attribute 1 does not.
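
For readers unfamiliar with that filter, the following sketch shows what the predicate from the snippet selects when applied to a plain list of slots. The Slot dataclass and select_slots are stand-ins invented for illustration. The real _slot_iter, including its first_hit_only behaviour, is not reproduced here.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Slot:
    """Hypothetical stand-in for a slot definition; only the fields the filter reads."""
    name: str
    identifier: bool = False
    key: bool = False
    ifabsent: Optional[str] = None


def is_leading_key(slot: Slot) -> bool:
    """Same predicate as the lambda in the pythongen snippet."""
    return bool((slot.identifier or slot.key) and not slot.ifabsent)


def select_slots(slots: List[Slot], test: Callable[[Slot], bool]) -> List[str]:
    """Return the names of the slots that satisfy the test, in order."""
    return [slot.name for slot in slots if test(slot)]


# Example slots: the "id" and "code" entries are made up to exercise the predicate.
SLOTS = [
    Slot("id", identifier=True),
    Slot("test attribute 1"),
    Slot("test attribute 2"),
    Slot("code", key=True, ifabsent="string(unknown)"),
]

if __name__ == "__main__":
    print(select_slots(SLOTS, is_leading_key))
    print(select_slots(SLOTS, lambda s: not is_leading_key(s)))
```

In this sketch neither test attribute is an identifier or a key, so this pass alone selects neither of them. Any ordinary slot has to be picked up by a later pass over the class's slots.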

cmungall commented 4 years ago

This is what I would expect to see for the python:

@dataclass
class TestClass(YAMLRoot):
    _inherited_slots: ClassVar[List[str]] = []

    class_class_uri: ClassVar[URIRef] = URIRef("https://microbiomedata/schema/TestClass")
    class_class_curie: ClassVar[str] = None
    class_name: ClassVar[str] = "test class"
    class_model_uri: ClassVar[URIRef] = URIRef("https://microbiomedata/schema/TestClass")

    test_attribute_1: Optional[str] = None
    test_attribute_2: Optional[str] = None

cmungall commented 4 years ago

Note: a workaround is possible by always adding a slot_usage:

id: https://microbiomedata/schema

prefixes:
  biolinkml: https://w3id.org/biolink/biolinkml/

imports:
  - biolinkml:types

classes:

  named thing: {}

  test class:
    slots:
      - test attribute 1
      - test attribute 2
    slot_usage:
      test attribute 1: {}
      test attribute 2: {}

slots:

  attribute:
    domain: named thing

  test attribute 1:
    is_a: attribute

  test attribute 2: {}

This produces the expected Python:

@dataclass
class TestClass(YAMLRoot):
    _inherited_slots: ClassVar[List[str]] = []

    class_class_uri: ClassVar[URIRef] = URIRef("https://microbiomedata/schema/TestClass")
    class_class_curie: ClassVar[str] = None
    class_name: ClassVar[str] = "test class"
    class_model_uri: ClassVar[URIRef] = URIRef("https://microbiomedata/schema/TestClass")

    test_attribute_1: Optional[str] = None
    test_attribute_2: Optional[str] = None
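
If the workaround has to be applied to a larger schema, it could be automated along these lines. This is a sketch that assumes the schema has already been loaded into a plain dict (for example with a YAML parser). add_empty_slot_usage is a hypothetical helper, not part of biolinkml.

```python
import copy
from typing import Dict

# Hand-written dict mirror of the test schema's classes section.
SCHEMA: Dict = {
    "classes": {
        "named thing": {},
        "test class": {"slots": ["test attribute 1", "test attribute 2"]},
    },
}


def add_empty_slot_usage(schema: Dict) -> Dict:
    """Return a copy in which every declared slot has a slot_usage entry."""
    patched = copy.deepcopy(schema)  # leave the caller's schema untouched
    for cls in patched.get("classes", {}).values():
        if not cls:  # covers both `{}` and an empty YAML value (None)
            continue
        slots = cls.get("slots", [])
        if not slots:
            continue
        usage = cls.setdefault("slot_usage", {})
        for slot_name in slots:
            # Keep any existing usage; add an empty one otherwise.
            usage.setdefault(slot_name, {})
    return patched


if __name__ == "__main__":
    print(add_empty_slot_usage(SCHEMA)["classes"]["test class"]["slot_usage"])
```

The helper copies the schema before patching so the original definition stays available for comparison.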

The JSON Schema is better, but it is still inducing slots at the parent level, which is wrong:

{
   "$id": "https://microbiomedata/schema",
   "$schema": "http://json-schema.org/draft-04/schema#",
   "definitions": {
      "attribute": {
         "type": "string"
      },
      "test_attribute_1": {
         "type": "string"
      },
      "test_attribute_2": {
         "type": "string"
      }
   },
   "properties": {
      "NamedThing": {
         "description": "",
         "properties": {
            "attribute": {
               "type": "string"
            },
            "test_attribute_1": {
               "type": "string"
            }
         },
         "title": "NamedThing",
         "type": "object"
      },
      "TestClass": {
         "description": "",
         "properties": {
            "test_attribute_1": {
               "type": "string"
            },
            "test_attribute_2": {
               "type": "string"
            }
         },
         "title": "TestClass",
         "type": "object"
      }
   },
   "title": "schema",
   "type": "object"
}

hsolbrig commented 4 years ago

It appears to behave as advertised. OK to close.