linkml / linkml

Linked Open Data Modeling Language
https://linkml.io/linkml

Mixins aren't factored into post-init for pythongen classes #355

Open kevinschaper opened 3 years ago

kevinschaper commented 3 years ago

I'm hitting this problem with the generated Python classes for the biolink model. It looks like the generated code doesn't recognize that GeneOrGeneProduct is a mixin of Gene.

In the biolink model, Gene is defined with:

```
    description: >-
      A region (or regions) that includes all of the sequence elements
      necessary to encode a functional transcript. A gene locus may include
      regulatory regions, transcribed regions and/or other
      functional sequence regions.
    is_a: biological entity
    mixins:
      - gene or gene product
      - genomic entity
      - chemical entity or gene or gene product
      - physical essence
      - ontology class
    aliases: ['locus']
    slots:
      - symbol
      - synonym
      - xref ...
```

and GeneToPhenotypicFeatureAssociation is defined as:

```
gene to phenotypic feature association:
    is_a: association
    exact_mappings:
      - WBVocab:Gene-Phenotype-Association
    defining_slots:
      - subject
      - object
    mixins:
      - entity to phenotypic feature association mixin
      - gene to entity association mixin
    slot_usage:
      subject:
        range: gene or gene product
        description: "gene in which variation is correlated with the phenotypic feature"
        examples:
          - value: HGNC:2197
            description: "COL1A1 (Human)"
```

My code:

```
import uuid

# Gene, PhenotypicFeature and GeneToPhenotypicFeatureAssociation come from
# the generated biolink model; `row` is one record of the input data.
gene = Gene(id='Xenbase:' + row['SUBJECT'], category="biolink:Gene")

phenotype = PhenotypicFeature(id=row['OBJECT'], category="biolink:PhenotypicFeature")

association = GeneToPhenotypicFeatureAssociation(
    category="biolink:GeneToPhenotypicFeatureAssociation",
    id="uuid:" + str(uuid.uuid1()),
    subject=gene,
    predicate="biolink:has_phenotype",
    object=phenotype,
    relation=row['RELATION'].replace('_', ':')
)
```

This fails at the following check in the generated `__post_init__`:

```
if not isinstance(self.subject, GeneOrGeneProduct):
    self.subject = GeneOrGeneProduct(**as_dict(self.subject))
```

and gives this error:

```
    association = GeneToPhenotypicFeatureAssociation(
  File "<string>", line 20, in __init__
  File "/Users/kschaper/Documents/Monarch/koza/koza/biolink/model.py", line 6376, in __post_init__
    self.subject = GeneOrGeneProduct(**as_dict(self.subject))
  File "<string>", line 4, in __init__
  File "/Users/kschaper/Documents/Monarch/koza/koza/biolink/model.py", line 3350, in __post_init__
    super().__post_init__(**kwargs)
  File "/Users/kschaper/Documents/Monarch/koza/venv/lib/python3.9/site-packages/linkml_runtime/utils/yamlutils.py", line 46, in __post_init__
    raise ValueError('\n'.join(messages))
ValueError:  Unknown argument: id = 'Xenbase:XB-GENE-1000632'
 Unknown argument: iri = None
 Unknown argument: category = ['biolink:Gene']
 Unknown argument: type = None
 Unknown argument: description = None
 Unknown argument: source = None
 Unknown argument: provided_by = []
 Unknown argument: has_attribute = {}
 Unknown argument: symbol = None
 Unknown argument: synonym = []
 Unknown argument: xref = []
 Unknown argument: has_biological_sequence = None
```
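
The failure mode can be reproduced without linkml. What follows is a minimal sketch using plain dataclasses, not the generated biolink code: the class bodies and the `name` and `symbol` fields are stand-ins chosen for illustration, and `dataclasses.asdict` takes the place of linkml's `as_dict`. Because the mixin is not a Python base class of `Gene`, the `isinstance` check fails and the coercion passes fields the mixin does not declare. Plain dataclasses raise `TypeError` at that point, whereas the traceback above shows linkml_runtime collecting the unexpected fields into a `ValueError`.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class GeneOrGeneProduct:
    """Stand-in for the mixin; it declares only its own (hypothetical) slot."""
    name: Optional[str] = None


@dataclass
class Gene:
    """Stand-in for the generated class; the mixin is NOT a Python base class."""
    id: str
    symbol: Optional[str] = None
    name: Optional[str] = None


@dataclass
class Association:
    subject: object

    def __post_init__(self):
        # Same shape as the generated check quoted in the report.
        if not isinstance(self.subject, GeneOrGeneProduct):
            self.subject = GeneOrGeneProduct(**asdict(self.subject))


def try_build():
    """Return the error message raised while building the association, if any."""
    try:
        Association(subject=Gene(id="Xenbase:XB-GENE-1000632"))
    except TypeError as exc:
        return str(exc)
    return None


print(try_build())
```

The printed message names a keyword argument that `GeneOrGeneProduct` does not accept, which is the same situation as the `Unknown argument` lines in the report.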

sierra-moxon commented 3 years ago

It looks like we need to add an optional mixin closure to the class_identifier_path method in generator.py. I did this kind of work for BMT; can I take a stab at this one?
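
One way to read "mixin closure" is the set of every class reachable by following `is_a` and `mixins` links transitively. A minimal sketch of that idea follows. It is not the code in generator.py, and `SCHEMA`, `ancestors`, and `satisfies_range` are hypothetical names. The entries mirror the Gene definition quoted at the top of the thread, with the parents of the other classes omitted.

```python
# Hypothetical, trimmed schema: class name -> (is_a parent, list of mixins).
SCHEMA = {
    "biological entity": (None, []),
    "gene or gene product": (None, []),
    "genomic entity": (None, []),
    "chemical entity or gene or gene product": (None, []),
    "physical essence": (None, []),
    "ontology class": (None, []),
    "gene": (
        "biological entity",
        [
            "gene or gene product",
            "genomic entity",
            "chemical entity or gene or gene product",
            "physical essence",
            "ontology class",
        ],
    ),
}


def ancestors(name, schema, include_mixins=True):
    """Return the reflexive, transitive set of ancestors of `name`."""
    seen = set()
    stack = [name]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        parent, mixins = schema[current]
        if parent is not None:
            stack.append(parent)
        if include_mixins:
            # Optionally follow mixin links as well as is_a links.
            stack.extend(mixins)
    return seen


def satisfies_range(class_name, range_name, schema):
    """True if `range_name` is the class itself or one of its ancestors."""
    return range_name in ancestors(class_name, schema)
```

With mixins followed, `satisfies_range("gene", "gene or gene product", SCHEMA)` is true. With `include_mixins=False` the closure holds only `gene` and `biological entity`, which is consistent with the check failing as reported.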