Open bendichter opened 5 years ago
1. Inform users ...
Having a "best practice" that types should be defined at the top level seems reasonable either way.
2\. You could see this freedom of the schema language as a bug ...
From the perspective of the language, I don't see why this would be considered a bug. While specifying new types at the top level may be good practice, being able to define new types at nested levels seems perfectly reasonable. However, I understand that restricting definitions to the top level may help simplify the implementation of the schema language.
3\. We could build logic to search the tree for new types.
My current inclination is that this is probably what we should do.
There are two ways to define a new `Container` type in the spec language. You can either define a `Container` type in the `/groups` of the schema with a `_def` field (and optionally an `_inc`) and then use it in another place with an `_inc` field (and no `_def` field), or you can define it and use it at the same time by including `_def` and `_inc` fields in a nested spot of the schema. Most types are defined in `/groups` and `/datasets` and included separately, but there are cases in the NWB:N core schema where types are defined as they are used, e.g. https://github.com/NeurodataWithoutBorders/pynwb/blob/a081b39803f0ffb535b2198a2fdd81007ffa442e/src/pynwb/data/nwb.file.yaml#L234-L243
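To make the two styles concrete, here is a rough sketch of the same pair of types written both ways, as the Python dicts you would get after loading the YAML. It assumes `_def` and `_inc` above are shorthand for the `neurodata_type_def` and `neurodata_type_inc` keys; the type names `Recording` and `Electrode` are made up for illustration and are not from the core schema.

```python
# Style A: define the type at the top level, then include it elsewhere.
# "Recording" and "Electrode" are hypothetical type names.
top_level_style = {
    "groups": [
        {
            "neurodata_type_def": "Electrode",
            "neurodata_type_inc": "Container",
            "doc": "A single electrode.",
        },
        {
            "neurodata_type_def": "Recording",
            "neurodata_type_inc": "Container",
            "doc": "A recording that holds electrodes.",
            # Use of the type: _inc only, no _def.
            "groups": [
                {"neurodata_type_inc": "Electrode", "quantity": "*", "doc": "Electrodes."}
            ],
        },
    ]
}

# Style B: define and use the type in one nested spot (_def and _inc together).
nested_style = {
    "groups": [
        {
            "neurodata_type_def": "Recording",
            "neurodata_type_inc": "Container",
            "doc": "A recording that holds electrodes.",
            "groups": [
                {
                    "neurodata_type_def": "Electrode",
                    "neurodata_type_inc": "Container",
                    "quantity": "*",
                    "doc": "A single electrode.",
                }
            ],
        }
    ]
}
```

In style A, `Electrode` appears as an entry of the top-level `groups` list; in style B it exists only inside `Recording`.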
As far as I can tell, this capability is purely for convenience and offers no additional features or expressive power for the defined types. Currently our `get_class` (i.e. `type_map.get_container_cls` in hdmf) function only works at the surface level (in `'/groups'`) and will not register nested `Container` definitions. As a result, there are schemas that follow the schema language rules, but for which the API auto-generation will not work. I see a few possible solutions.

Here is a minimal example. The last 4 commands are the critical piece.
1) Inform users that if they want to auto-generate the API, all type definitions must be in `/groups` and `/datasets`. This is a limitation only in style and not in function. The biggest downside here is that the rules used for writing extensions will be slightly more strict than those for the core (again, only stylistically).
2) You could see this freedom of the schema language as a bug, since it provides multiple right ways to do something without offering any benefit. We could change the schema language rules so that you can only define types at the surface and change the schema to match. This should not cause any compatibility issues for the core, but would raise some eyebrows. If we are going to do this we should do it now before community extensions start to accumulate.
3) We could build logic to search the tree for new types.
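For option 3, the search could be as simple as a recursive walk over the loaded schema. The sketch below is only an illustration of the idea, not how hdmf is implemented: `find_type_defs` is a hypothetical helper, the schema is a made-up dict, and it assumes type definitions are marked with a `neurodata_type_def` key and can only be nested under `groups` and `datasets`.

```python
def find_type_defs(spec, key="neurodata_type_def"):
    """Yield every type name defined anywhere in a schema tree (hypothetical helper)."""
    if isinstance(spec, dict):
        if key in spec:
            yield spec[key]
        # Assumption: nested specs only live under "groups" and "datasets".
        for child_key in ("groups", "datasets"):
            for child in spec.get(child_key, []):
                yield from find_type_defs(child, key)


# Made-up schema: "Electrode" is defined where it is used, inside "Recording".
schema = {
    "groups": [
        {
            "neurodata_type_def": "Recording",
            "neurodata_type_inc": "Container",
            "groups": [
                {
                    "neurodata_type_def": "Electrode",
                    "neurodata_type_inc": "Container",
                    "quantity": "*",
                }
            ],
            "datasets": [{"name": "rate", "dtype": "float"}],
        }
    ],
    "datasets": [{"neurodata_type_def": "Trace", "neurodata_type_inc": "Data"}],
}

# A surface-only scan sees just the top-level definitions.
surface = [
    s["neurodata_type_def"]
    for s in schema["groups"] + schema["datasets"]
    if "neurodata_type_def" in s
]

# The recursive walk also finds the nested definition.
found = list(find_type_defs(schema))
```

Here `surface` misses `Electrode` while `found` includes it, so nested definitions could be registered from `found` before classes are generated.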
Thoughts?
Checklist