Open rly opened 1 year ago
I suggest adding new classes NWBDatasetIncSpec and NWBGroupIncSpec
That's an interesting proposal. I think from an interface perspective this could work. I think the name IncSpec
is not very intuitive, maybe IncludeDatasetSpec
? How does this affect definition of datasets/groups that do not have a neurodata_type_def but only have a fixed name? Would those also be defined outsite and then included?
I like IncludeDatasetSpec
or maybe InnerDatasetSpec
. Good point. I forgot that use case. No, those should be allowed in this new spec type.
I forgot that use case. No, those should be allowed in this new spec type.
Just to clarify. The primary difference then is that neurodata_type_def
is not allowed in IncludeDatasetSpec
. In some sense, this is the difference then between defining a class and instantiating an instance of it (only the term instantiating a type would be confusing here).
For name, I think describing the "operation" (e.g., include) rather than location (e.g., inner) seems more intuitive to me. The name IncludeDatasetSpec
is not bad, but might be confused with the include term in a program language sense of an include in C/C++ (which may be confusing if the class allows creating datasets specs with fixed name and no type). How about maybe InsertDatasetSpec
?
Just noting another instance where this IncludeDatasetSpec
(or whatever we want to call it) class, automatic checks for nested type definitions, and appropriate documentation around these would have prevented copied and nested spec definitions: https://github.com/nwb-extensions/staged-extensions/pull/44
I will bump this issue up in priority because we will only see more of these instances occur.
What would you like to see added to HDMF?
When building extensions, users are often confused over how to define groups and datasets within a parent group. For example, one could define:
While technically legal, this nested type definition is difficult to parse. This code also hides the fact that the
Parcellation
is now defined and can be reused in other types. As a result, this goes against our schema best practices https://github.com/hdmf-dev/hdmf-schema-language/issues/14This is the preferred code:
However, NWBDatasetSpec serves two different functions here. The first defines the new type. The second says only instances of this type can be used and at least one must be provided. In the first usage where the type is defined,
quantity
makes no sense, andname
really shouldn't be set. In the second usage,neurodata_type_def
should not be used, anddefault_name
doesn't really make sense. Same thing for groups.This is explained in the language, but honestly, I think this is confusing and the third row of the table is distinctly different from the other three and should be separated out as I describe below. https://hdmf-schema-language.readthedocs.io/en/latest/description.html#data-type-def-and-data-type-inc
Is your feature request related to a problem?
No response
What solution would you like?
To make this distinction easier for users, I suggest adding new classes NWBDatasetIncSpec and NWBGroupIncSpec that are the only types allowed in a NWBGroupSpec definition under
groups
anddatasets
.neurodata_type_def
is not allowed here. I would also change the keyword to justneurodata_type
. We would also need to update the tutorials.We can also warn when
quantity
andname
are used in a NWBDatasetSpec that is not a NWBDatasetIncSpec (we cannot restrict it because that would break backwards compatibility).I suggest we also add these new spec types to the hdmf-schema-language.
Do you have any interest in helping implement the feature?
Yes.
Code of Conduct