Open prjemian opened 2 years ago
This could (and probably should) be solved in the specific documentation that cites the replacement. The documentation should reference the replacement, rather than the base class target.
The documentation might make such links automatically by a two-step process:
NeXus supports two main categories of definitions: base classes and application definitions. The basic difference is the default optionality of the defined elements: “optional” in base class definitions, but “required” in application definitions. Note also that while most definitions extend NXobject, a few application definitions extend another application definition. As the documentation says: “In contrast to NeXus base classes, NeXus supports inheritance in application definitions.”
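The default-optionality rule can be illustrated with a small sketch. It assumes toy NXDL-like snippets without an XML namespace, and the helper names `is_required` and `report` are hypothetical; the attribute names (`category`, `optional`, `recommended`, `required`, `minOccurs`) are the ones I understand NXDL to use, but treat the precedence between them as an assumption.

```python
import xml.etree.ElementTree as ET

# Toy NXDL-like snippets (namespace omitted for brevity; real NXDL files declare one).
BASE = """<definition name="NXentry" category="base" extends="NXobject">
  <field name="title"/>
</definition>"""

APPDEF = """<definition name="NXarpes" category="application" extends="NXobject">
  <group type="NXentry">
    <field name="title"/>
    <field name="notes" optional="true"/>
  </group>
</definition>"""


def is_required(item: ET.Element, category: str) -> bool:
    """Hypothetical helper: resolve the effective optionality of one item."""
    if item.get("optional") == "true" or item.get("recommended") == "true":
        return False
    if item.get("required") == "true":
        return True
    if item.get("minOccurs") is not None:
        return int(item.get("minOccurs")) > 0
    # Nothing stated explicitly: fall back to the default of the category.
    return category == "application"


def report(xml_text: str) -> dict:
    """Map each field name in the snippet to its effective 'required' flag."""
    root = ET.fromstring(xml_text)
    category = root.get("category")
    return {f.get("name"): is_required(f, category) for f in root.iter("field")}


print(report(BASE))
print(report(APPDEF))
```

The same unmarked `title` field comes out as optional in the base class snippet and as required in the application definition snippet, which is the whole difference described above.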
On the other hand, since the keyword ‘extends’ so rarely points to anything other than NXobject in current application definitions, it is unclear whether and how such inheritance is implemented by different tools, and how inherited data items are handled inside the new definition.
Is only extension (the addition of new data items such as groups, fields, or attributes) supported, or also overriding, where already introduced elements can be redefined?
Another question is how the data item definitions of base classes are (re)used in another base class or in an application definition if no inheritance is supported.
In practice, reuse is triggered by referencing a base class via ‘type=’. The assumption here is that all data items defined in the tree of the referenced base class, and in the trees of the base classes referenced therein, automatically become available for reuse under a (not always specified) ‘name’ in the new definition. Hence, definitions are inherited inside base classes as well.
When such a reference also provides extra definition elements (e.g. a doc in the case of NXbeam/DATA), they are treated as a specific definition, valid only for this item, that further specifies the original base definition (for NXbeam/DATA(doc), the original NXdata(doc) is in effect extended). Reuse in an application definition works the same way, except that optionality defaults to “required” (e.g. NXarpes/ENTRY/title(optional)=False as opposed to NXentry/title(optional)=True). Note that although definitions are inherited, any modification of a specific data item (e.g. extending/specialising the documentation, changing optionality, adding new data items) results in a new item definition. This extended/modified item definition is then what is inherited when the item is referenced inside another definition. E.g. NXmy_arpes/ENTRY/arpes_base[NXarpes], just like NXmy_arpes(NXarpes), would inherit NXarpes/ENTRY/INSTRUMENT/analyser[NXdetector]/acquisition_mode(enum:[swept, fixed]) rather than the one from NXdetector (enum=[gated, triggered, summed, event, histogrammed, decimated]).
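The acquisition_mode case can be sketched as a lookup along the chain of definitions, ordered from the most specific to the most general. This is only an illustration of the rule described above: the function name `effective_enumeration` and the tuple layout are hypothetical, while the paths and enumeration values are the ones quoted in this comment.

```python
from typing import List, Optional, Tuple

# (definition path, enumeration or None), most specific definition first.
CHAIN: List[Tuple[str, Optional[List[str]]]] = [
    ("NXarpes.nxdl.xml:/ENTRY/INSTRUMENT/analyser/acquisition_mode",
     ["swept", "fixed"]),
    ("NXdetector.nxdl.xml:/acquisition_mode",
     ["gated", "triggered", "summed", "event", "histogrammed", "decimated"]),
]


def effective_enumeration(chain) -> Optional[List[str]]:
    """Return the enumeration of the most specific definition that declares one."""
    for _path, enum in chain:
        if enum is not None:
            return enum
    return None  # no definition in the chain restricts the values


print(effective_enumeration(CHAIN))
```

Anything that references NXarpes sees the NXarpes enumeration first, so the NXdetector list only applies when no more specific definition restricts the values.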
Through this ‘type’-based reuse of definitions, NeXus implements single, multilevel, and hierarchical inheritance (see https://beginnersbook.com/2013/05/java-inheritance-types/). For example, NXarpes/ENTRY/INSTRUMENT/analyser[NXdetector] extends the referenced NXentry/INSTRUMENT/DETECTOR, which references NXinstrument/DETECTOR, which in turn references NXdetector, by adding new data items, including the field ‘energies’.
Inheritance in NeXus allows the reuse of a complete definition tree together with all of its inherited sub-definitions. A data item can be overridden by referencing it with the corresponding name/type combination. Note that, for convenience, doc strings are not overridden but extended/specialised by default; an overriding doc string must state explicitly if the inherited doc strings are to be disregarded.
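A minimal sketch of the doc string rule, assuming the inherited docs are available as (source, doc) pairs ordered from the most specific definition to the most general. How an overriding doc string would "explicitly state" that inherited docs should be ignored is not defined here, so the `[no-inherit]` marker below is purely hypothetical, as are the function name and the placeholder texts.

```python
# Hypothetical opt-out marker; NXDL defines no such keyword.
NO_INHERIT = "[no-inherit]"

# Toy (source, doc) pairs, most specific definition first. The texts are placeholders.
CHAIN = [
    ("NXarpes.nxdl.xml:/ENTRY/DATA", ""),
    ("NXentry.nxdl.xml:/DATA", "Entry-level remarks on the data group."),
    ("NXdata.nxdl.xml:", "General description of the data group."),
]


def collect_docs(chain, marker=NO_INHERIT):
    """Gather doc strings from specific to general, stopping at an explicit opt-out."""
    docs = []
    for source, doc in chain:
        if not doc:
            continue  # nothing stated at this level, keep looking further up
        docs.append((source, doc.replace(marker, "").strip()))
        if marker in doc:
            break  # this level asked for inherited docs to be disregarded
    return docs


for source, doc in collect_docs(CHAIN):
    print(f"documentation ({source}): {doc}")
```

By default every level contributes, so a specialised doc adds to the general one rather than hiding it; only an explicit opt-out cuts the chain.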
Inheritance relationships:
- IS A: implemented in NeXus by ‘extends=’ or ‘type=’
- HAS or MAY CONTAIN (depending on optionality): implemented in NeXus by explicit or inherited sub-definitions
Example for retrieving inherited doc strings for NXarpes/ENTRY/DATA while processing a data file:
INFO: ===== GROUP (//entry/data [NXarpes::/NXentry/NXdata]): <HDF5 group "/entry/data" (4 members)>
INFO: classpath: ['NXentry', 'NXdata']
INFO: classes:
NXarpes.nxdl.xml:/ENTRY/DATA
NXentry.nxdl.xml:/DATA
NXdata.nxdl.xml:
INFO: <
.. note:: Before the NIAC2016 meeting [#]_, at least one
...
INFO: documentation (NXdata.nxdl.xml:):
INFO:
:ref:`NXdata`
describes the plottable data and related dimension scales.
.. index:: plotting
It is mandatory that there is at least one :ref:`NXdata` group
...
Similar example with also retrieving enumeration lists for NXarpes/ENTRY/INSTRUMENT/analyser[NXdetector]/acquisition_mode:
INFO: ===== FIELD (//entry/instrument/analyser/acquisition_mode): <HDF5 dataset "acquisition_mode": shape (), type "|O">
INFO: value: b'fixed'
INFO: classpath: ['NXentry', 'NXinstrument', 'NXdetector', 'NX_CHAR']
INFO: classes:
NXarpes.nxdl.xml:/ENTRY/INSTRUMENT/analyser/acquisition_mode
NXdetector.nxdl.xml:/acquisition_mode
INFO: <
An implementation is available at: https://github.com/nomad-coe/nomad-parser-nexus/blob/bb3ef7693643a7b745ee8c9786dd68d83e361663/nexusparser/tools/nexus.py#L639-L700
Short draft:
import xml.etree.ElementTree as ET

# Note: get_nx_class, get_node_name and get_local_name_from_xml are assumed to be
# provided by the full implementation linked above; they are not part of this draft.


def get_inherited_nodes(nxdl_path: str = None):
    """Returns a list of ET.Element for the given path."""
    # let us start with the given definition file
    elist = []
    add_base_classes(elist, nxdl_path.split('/')[0])
    # walk along the path
    for html_name in nxdl_path.split('/')[1:]:
        # from low priority inheritance classes to higher
        for ind in range(len(elist) - 1, -1, -1):
            elist[ind] = get_direct_child(elist[ind], html_name)
            if elist[ind] is None:
                del elist[ind]
                continue
            # override: remove low priority inheritance classes if class_type is overridden
            if len(elist) > ind + 1 and get_nx_class(elist[ind]) != get_nx_class(elist[ind + 1]):
                del elist[ind + 1:]
            # add new base class(es) if new element brings such (and not a primitive type)
            if len(elist) == ind + 1 and get_nx_class(elist[ind])[0:3] != 'NX_':
                add_base_classes(elist)
    return elist


def add_base_classes(elist, nx_name=None):
    """Add the base classes corresponding to the last element in elist to the list.
    Note that if elist is empty, an nxdl file with the name of nx_name is used."""
    if elist and nx_name is None:
        nx_name = get_nx_class(elist[-1])
    # stop when there is no further class to follow, e.g. at a definition root
    # (assumes get_nx_class returns nothing for such an element)
    if not nx_name:
        return
    # skip files already in the list ('nxdlbase' is assumed to be set on parsed elements)
    if elist and f"{nx_name}.nxdl.xml" in (e.get('nxdlbase') for e in elist):
        return
    elem = ET.parse(f"{nx_name}.nxdl.xml").getroot()
    elist.append(elem)
    # add inherited base classes
    if 'extends' in elem.attrib and elem.attrib['extends'] != 'NXobject':
        add_base_classes(elist, elem.attrib['extends'])
    else:
        add_base_classes(elist)


def get_direct_child(nxdl_elem, html_name):
    """Returns the child of nxdl_elem which has a name
    corresponding to the html documentation name html_name."""
    for child in nxdl_elem:
        if get_local_name_from_xml(child) in ('group', 'field', 'attribute') \
                and html_name == get_node_name(child):
            return child
    return None
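The draft relies on three helpers that are not shown. A minimal sketch of what they might look like follows; these are my assumptions about their behaviour, inferred from how the draft and the log output above use them (unnamed groups appearing as upper-case names such as DATA, fields without a type falling back to NX_CHAR), not the code of the linked implementation.

```python
import xml.etree.ElementTree as ET
from typing import Optional


def get_local_name_from_xml(element: ET.Element) -> str:
    """Strip a '{namespace}' prefix from the tag, if there is one."""
    return element.tag.rsplit("}", 1)[-1]


def get_nx_class(element: ET.Element) -> Optional[str]:
    """Return the 'type' of an item; fields and attributes default to NX_CHAR."""
    if get_local_name_from_xml(element) in ("field", "attribute"):
        return element.get("type", "NX_CHAR")
    return element.get("type")  # None for elements without a type


def get_node_name(element: ET.Element) -> str:
    """Return the 'name', or the upper-cased type without 'NX' for unnamed groups."""
    name = element.get("name")
    if name is not None:
        return name
    return element.get("type", "")[2:].upper()


group = ET.fromstring('<group type="NXdata"/>')
field = ET.fromstring('<field xmlns="urn:example" name="title"/>')
print(get_node_name(group), get_nx_class(group))
print(get_local_name_from_xml(field), get_node_name(field), get_nx_class(field))
```

With helpers of this shape, a path segment such as DATA matches an unnamed `<group type="NXdata"/>`, while a named group such as analyser is matched by its name.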
@sanbrock Will this issue be resolved in the next few days? Is it necessary to resolve this for the release of NXDL now?
@prjemian Creating the Vocabulary Table is a good first step. I think that is enough for the next release. We can then review what the next steps should be.
Thanks!
E.g. if NXdetector_module/data_origin is overridden by NXdetector/detector_module/data_origin, and NXmyappdef:/detector(NXdetector)/detector_module(NXdetector_module) is then used, will it inherit from NXdetector_module or from NXdetector/NXdetector_module?
Proposal: