linkml / linkml

Linked Open Data Modeling Language
https://linkml.io/linkml

Generator metadata - Metamodel support #1961

Closed sneakers-the-rat closed 4 months ago

sneakers-the-rat commented 4 months ago

Is your feature request related to a problem? Please describe.

Apologies if this has already been proposed elsewhere. I know we have talked about this before, and I did a bit of searching but didn't quite find what I was looking for. plz close this if I missed a matching issue.

The generators each support some different subset of the metamodel, and it can be tricky to know when you can rely on a generator to give you a faithful representation of a given schema. This also makes it difficult to track propagation of changes in the metamodel to the generators - the motivating example in this case being array support.

Generators should have some way to declare what features they support.

Challenges:

Some related/illustrative issues:

Describe the solution you'd like

A start would be to add a classvar to Generator that's just like

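from typing import Any, List, Literal, Optional, Union

from pydantic import BaseModel

class Condition(BaseModel):
    # describes when a feature applies, e.g. only for some generator parameter value
    type: Literal['parameter', 'etc']
    key: str
    value: Any

class Feature(BaseModel):
    # None = no condition attached to this feature declaration
    when: Optional[Condition] = None

class ArrayFeature(Feature):
    anyshape: bool = False
    labeled: bool = False

class GeneratorSupports(BaseModel):
    # Union instead of `|` so the sketch also works on Python < 3.10
    arrays: Union[bool, ArrayFeature, List[ArrayFeature]] = False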

where we might allow something like

from typing import ClassVar

class PydanticGenerator:
    supports: ClassVar[GeneratorSupports] = GeneratorSupports(
        arrays = [
            ArrayFeature(
                # pydantic coerces this dict into a Condition
                when={'type': 'parameter', 'key': 'array_representation', 'value': 'Numpydantic'},
                anyshape = True,
                # ...
            ), # ...
        ]
    )

or we could just flatten the whole thing out. might be easier to start with that since it would be simpler.
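A rough sketch of what the flattened version might look like, assuming "flattened" means one boolean per feature and no conditions. The field names, the `Generator` stand-in, and `ExampleGenerator` are all made up for illustration and are not existing linkml classes. It uses a stdlib dataclass instead of pydantic only to keep the sketch dependency-free; a `BaseModel` would work the same way here.

```python
from dataclasses import dataclass
from typing import ClassVar


@dataclass(frozen=True)
class FlatGeneratorSupports:
    """Hypothetical flat declaration: one flag per feature, no `when` conditions."""

    arrays: bool = False
    arrays_anyshape: bool = False
    arrays_labeled: bool = False
    enums: bool = False


class Generator:
    """Stand-in base class (not the real linkml Generator)."""

    # Default: nothing is claimed until a subclass declares it.
    supports: ClassVar[FlatGeneratorSupports] = FlatGeneratorSupports()


class ExampleGenerator(Generator):
    # Illustrative values only, not a statement about any real generator.
    supports: ClassVar[FlatGeneratorSupports] = FlatGeneratorSupports(
        arrays=True,
        arrays_anyshape=True,
    )
```

The tradeoff is that a flat declaration can't say "only when `array_representation` is set to X", so conditional support would have to be added later or handled some other way.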

Then we would be able to simplify all the special casing in the test_compliance suite. To pick a random example, the information that the SQL generator doesn't support enums is hardcoded here: https://github.com/linkml/linkml/blob/0c3afa90701dd18ce874f6103cc7668684eb697d/tests/test_compliance/test_enum_compliance.py#L215 . That seems pretty hard to maintain and document to me, even though I really like all the abstraction that went into the suite and it works super well.
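To make the test-suite idea concrete, here is a minimal sketch of how a compliance test could ask the generator instead of hardcoding the answer. Everything here is hypothetical: `Supports`, `should_check`, and the two stand-in generator classes are invented for the example, and this is not how test_compliance is currently structured. The only detail taken from above is that the SQL generator doesn't support enums; the other flag values are illustrative.

```python
from dataclasses import dataclass, fields
from typing import ClassVar


@dataclass(frozen=True)
class Supports:
    """Hypothetical flat feature declaration."""

    arrays: bool = False
    enums: bool = False


class SQLLikeGenerator:
    """Stand-in for a generator that does not support enums."""

    supports: ClassVar[Supports] = Supports(enums=False)


class PydanticLikeGenerator:
    """Stand-in with illustrative values."""

    supports: ClassVar[Supports] = Supports(arrays=True, enums=True)


def should_check(generator_cls: type, feature: str) -> bool:
    """Return True if the generator declares support for `feature`.

    Generators without a `supports` declaration are treated as
    supporting nothing; unknown feature names raise so typos in
    tests fail loudly instead of silently skipping.
    """
    supports = getattr(generator_cls, "supports", None)
    if supports is None:
        return False
    known = {f.name for f in fields(supports)}
    if feature not in known:
        raise ValueError(f"unknown feature {feature!r}; expected one of {sorted(known)}")
    return bool(getattr(supports, feature))
```

A test could then do something like `if not should_check(gen_cls, "enums"): pytest.skip(...)`, and the same declaration could feed the docs, so the information lives in one place.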

How important is this feature? Select from the options below:

• Low - it's an enhancement but not crucial for work

let's call this "would make our lives easier, but would require a decent amount of refactoring"

When will use cases depending on this become relevant? Select from the options below:

• Long-term - 6 months - 1 year

cmungall commented 4 months ago

Apologies, I had some of this partially written up previously, but not actually online - I made a discussion item from notes I previously had:

This is a bit redundant with what you have here but good to see we are thinking on similar lines!

sneakers-the-rat commented 4 months ago

Oh I'm very into it. Let me close this and move my comment there, bc yours is a way more complete picture than this little sketch, but they're complementary.