linkml / linkml-model

Link Modeling Language (LinkML) model
https://linkml.github.io/linkml-model/docs/

`any_of` should accept a list of `ArrayExpression`s #199

Open sneakers-the-rat opened 3 months ago

sneakers-the-rat commented 3 months ago

In translating NWB, I am replacing my ArrayLike class array representations with the linkml array spec. NWB models allow dims and shape to be given as a list of lists, so that a given dataset can have several different possible shapes: https://schema-language.readthedocs.io/en/stable/specification_language_description.html#dims

Since array is implemented at the slot level rather than as a type, I need to be able to use it in any_of like this:

classes:
  MyClass:
    attributes:
      my_array:
        range: integer
        any_of:
        - array:
            dimensions:
            - alias: x
              exact_cardinality: 3
        - array:
            dimensions:
            - alias: x
              exact_cardinality: 3
            - alias: y
              exact_cardinality: 4

this, of course, opens up the potential for combinatorics between any_of used with range and array like this:

any_of:
- range: integer
- range: float
- array:
    dimensions:
    - alias: x
      exact_cardinality: 3
- array:
    dimensions:
    - alias: x
      exact_cardinality: 3
    - alias: y
      exact_cardinality: 4

which I would interpret as meaning the product of the two ranges and the two array shapes,

and like this:

any_of:
- range: integer
  array:
    dimensions:
    - alias: x
      exact_cardinality: 3
- range: float
  array:
    dimensions:
    - alias: x
      exact_cardinality: 3
    - alias: y
      exact_cardinality: 4

which would be (unambiguously) an integer array of shape 3 or a float array of shape 3 x 4

and i actually think that's a good thing - if we refactor the generators like https://github.com/linkml/linkml/pull/2019, separating the different phases of the build and using build results objects, then it would be easy to handle the combinatorics there (and i would implement it). It doesn't explode the complexity of arrays, it would just require better iteration in the generators (which is good anyway).

without this change, I don't think there would be a way to express a slot being able to have multiple exactly parameterized/labeled array shapes.

cc @rly

cmungall commented 3 months ago

For your second example, each any_of element would be interpreted independently, and then any_of would create a union expression over them; i.e. in Python

Union[int, float, NDArray[<R>, 3], NDArray[<R>, 4]]

The value of <R> would be determined by any inherited range, which may default to something like string

In order to derive

Union[
  NDArray[Shape['3'], int],
  NDArray[Shape['3, 4'], int],
  NDArray[Shape['3'], float],
  NDArray[Shape['3, 4'], float],
]

You would need something explicit like your 3rd expression?
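
To make the two readings concrete, here is a minimal sketch in plain Python. It is not LinkML generator code: the `Expr` tuple, the `independent` and `cross_product` functions, and the `"string"` fallback are all hypothetical names chosen for illustration. Each any_of element is modelled as an optional range plus an optional shape.

```python
from itertools import product
from typing import NamedTuple, Optional


class Expr(NamedTuple):
    """Hypothetical stand-in for one any_of element."""
    range: Optional[str] = None
    shape: Optional[tuple] = None


def independent(elements, inherited_range="string"):
    """Interpret each element on its own; an array with no range
    falls back to the inherited range (assumed to be 'string' here)."""
    out = []
    for e in elements:
        if e.shape is None:
            out.append((e.range, None))  # plain scalar alternative
        else:
            out.append((e.range or inherited_range, e.shape))
    return out


def cross_product(elements):
    """Pair every range-only element with every shape-only element."""
    ranges = [e.range for e in elements if e.range and e.shape is None]
    shapes = [e.shape for e in elements if e.shape and e.range is None]
    return list(product(ranges, shapes))


# The second example: ranges and arrays as separate any_of elements.
second_example = [
    Expr(range="integer"),
    Expr(range="float"),
    Expr(shape=(3,)),
    Expr(shape=(3, 4)),
]

# The third example: each any_of element carries both range and array.
third_example = [
    Expr(range="integer", shape=(3,)),
    Expr(range="float", shape=(3, 4)),
]

independent_second = independent(second_example)
product_second = cross_product(second_example)
independent_third = independent(third_example)
```

Under the independent reading the second example yields two scalars plus two arrays of the inherited range, while the product reading yields the four range/shape combinations. The third example yields exactly the two pairs it spells out, with no inherited range involved.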

sneakers-the-rat commented 3 months ago

Sure! Whichever interpretation. That works for me. I was just trying to think thru the edge cases in an implementation and treating each independently makes it easier

sneakers-the-rat commented 2 months ago

Here's a quick monkeypatch that meets my needs until we can get #200 merged -

def patch_array_expression() -> None:
    """
    Allow SlotDefinitions to use `any_of` with `array`

    see: https://github.com/linkml/linkml-model/issues/199
    """
    from dataclasses import field, make_dataclass
    from typing import Optional

    from linkml_runtime.linkml_model import meta

    new_dataclass = make_dataclass(
        "AnonymousSlotExpression",
        fields=[("array", Optional[meta.ArrayExpression], field(default=None))],
        bases=(meta.AnonymousSlotExpression,),
    )
    meta.AnonymousSlotExpression = new_dataclass
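
For anyone who wants to see the `make_dataclass` pattern in isolation, here is a small self-contained sketch. The two classes below are hypothetical stand-ins, not the real linkml_runtime classes, and `Patched` is just an illustrative name; only `dataclasses` from the standard library is used.

```python
from dataclasses import dataclass, field, fields, make_dataclass
from typing import Optional


@dataclass
class ArrayExpression:
    """Hypothetical stand-in for meta.ArrayExpression."""
    exact_number_dimensions: Optional[int] = None


@dataclass
class AnonymousSlotExpression:
    """Hypothetical stand-in for meta.AnonymousSlotExpression."""
    range: Optional[str] = None


# Build a subclass with the same name that adds an optional `array` field.
Patched = make_dataclass(
    "AnonymousSlotExpression",
    fields=[("array", Optional[ArrayExpression], field(default=None))],
    bases=(AnonymousSlotExpression,),
)

expr = Patched(range="integer", array=ArrayExpression(exact_number_dimensions=2))
field_names = [f.name for f in fields(Patched)]
```

The new class keeps the inherited fields, appends `array` after them, and still passes `isinstance` checks against the original class. One caveat that applies to any patch of this kind: rebinding a module attribute only affects code that looks the name up on the module afterwards, so anything that imported the class by name before the patch ran keeps the unpatched version.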