NeurodataWithoutBorders / nwb-schema

Data format specification schema for the NWB neurophysiology data format
http://nwb-schema.readthedocs.io
Other
52 stars 16 forks source link

neurodata_type_inc namespace #255

Open bendichter opened 5 years ago

bendichter commented 5 years ago

I think we need rules for how to look up neurodata_type_inc across the available namespaces. My vote would be first to search for the datatype in the same namespace, then in the core.

The problem arises when a neurodata_type_inc is defined that contains or inherits from another neurodata_type_inc that is not in either the same namespace or in core but in another namespace. If we want to support this case, and we don't want naming collisions, we'd have to change the schema language.

One possible solution to this would be to add a namespace to be optionally included with a neurodata_type_inc. This could be written as:

neurodata_type_inc:
- name: Surface
  namespace: ecog

If it is included in a neurodata_type_inc, then the neurodata_type_inc is from that namespace. If it is not included, then the search for the datatype first is within the same extension, then in core.

oruebel commented 5 years ago

Do you have a specific use-case where this problem arises right now?

The problem arises when a neurodata_type_inc is defined that contains or inherits from another neurodata_type_inc that is not in either the same namespace or in core but in another namespace.

In the specification of the namespace you specify which namespaces should be included in the namespace of your extensions, e.g:.

schema
- source: nwb.ephys.yaml
  neurodata_types: ElectricalSeries
- namespace: core
  neurodata_types: Interface

As such all types you include here are now visible in your namespace. As such, the look-up is flat and you need to ensure that you do not include multiple types with the same name in your namespace. I.e., if there are colliding type names from different namespaces, then that problem should be solved at the namespace level, not at the level of the individual sources. In practice, I would expect that this sort of problem is typically solved by explicitly including only the types you actually need, rather than pulling every possible type into your namespace. If this really becomes a problem, then I think you may need something like from A import B as C solution on the namespace level (i.e., somehow allow renaming of types when you include them in your schema in the namespace).

bendichter commented 5 years ago

I'd be fine with a from A import B as C solution. So in the spec generation API, would this look like ns_builder.include_type('DynamicTable', namespace='core' as='core_DynamicTable')

If we want users to import specific names, maybe we can nudge them to use ns_builder.include_type statements like import statements at the top of the create_extension_spec.py file. We could also warn them if they are adding a spec that uses neurodata_type_inc on types that have not been included and are not defined within the spec.

Another way to nudge them to do this is to add ns_builder.include_type statements to the extensions tutorial. Should I do that?

oruebel commented 5 years ago

Defining some include_type statements in the tutorial sounds find. For warnings, I think we'd need to create an error when including a namespace that contains typenames that are already defined.

Has this problem actually come up?

bendichter commented 5 years ago

The issue is arising as I am trying to get get_class to work for classes that contain other classes. I realized that I wasn't quite sure how I should be searching for data types across the available namespaces.

bendichter commented 5 years ago

When I hear the word "namespace" I think a logical structure that prevents naming collisions, so I assumed that you were interested in having them provide this functionality.

oruebel commented 5 years ago

I don't think you should need to have to search across namespaces, but all types should appear in the namespace as they are included there and there should never be 2 types with the same name included in the same namespace.

bendichter commented 5 years ago

namespace/neurodata_types is currently optional in the docs. Should we make it a requirement that extensions import specific neurodata_types? Otherwise we'll need to handle the case that they aren't defined.

ajtritt commented 5 years ago

I'm a little confused about the use case here myself. What situation are you addressing by loading a neurodata_type from one namespace into another? This was really meant for a situation where you want to build on top of a neurodata_type that a separate namespace defines. If you just want to use a neurodata_type from a namespace, then the namespace should be just be loaded/imported. This is how namespaces serve as a mechanism for preventing collisions.