Closed domna closed 2 months ago
I like the overall idea and agree that this is something that is needed. I am also not sure what would be the best approach.
Additionally or instead provide a depends field for the NXdata group to denote that it was linked to somewhere else. This is essentially a shortcut for writing out AXISNAME_depends + DATA_depends if they belong to the same target group.
In general, I have the feeling that NeXus more often uses a depends_on field or a @depends_on attribute to denote such concepts, instead of ELEMENT_depends. So maybe we could just use @depends_on attributes:
NXdata:
  \@depends_on(NX_CHAR):
    doc: |
      Points to the path of a field defining the data on which the ``DATA`` field depends.
  [...]
  AXISNAME(NX_NUMBER):
    \@depends_on(NX_CHAR):
      doc: |
        Points to the path of a field defining the axis on which the ``AXISNAME`` axis depends.
Then, these attributes would anyway be associated with the group/field, so there would be no need for the uppercase notation.
In any case, I would argue it should be the same for the NXdata group and the corresponding AXISNAME fields. That is, if we keep AXISNAME_depends, we should also use DATA_depends (as @domna originally suggested). This would also be in line with having \@AXISNAME_indices as an immediate child of the NXdata group.
I would vote for a generic \@depends_on attribute, because it is a clean approach, and we do not have to specify what we mean for every element.
In any case, I would argue it should be the same for the NXdata group and the corresponding AXISNAME fields. That is, if we keep AXISNAME_depends, we should also use DATA_depends (as @domna originally suggested). This would also be in line with having \@AXISNAME_indices as an immediate child of the NXdata group.
I agree and I think we cannot fully get rid of the AXISNAME_depends. If we only allow a top-level @depends, we can only reference full NXdata groups, but especially for the different axes we want to gather them from different transformations, i.e., explicitly not from a single NXdata group. Therefore, my suggestion is that we do DATA_depends and additionally allow a top-level depends which allows for referencing a whole NXdata group, i.e., it's a convenience for writing out all the depends if they come from the same group. Alternatively, we just omit the latter.
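The convenience idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `expand_group_depends`, the field names, and the target path are all hypothetical, and it assumes the target group holds fields with the same names as the local NXdata group. Nothing here is part of NeXus or of any library.

```python
def expand_group_depends(field_names, target_group):
    """Expand a group-level shortcut into one explicit path per field.

    Assumption (not from the NeXus standard): the target group holds
    fields with the same names as the local NXdata group.
    """
    base = target_group.rstrip("/")
    return {f"{name}_depends": f"{base}/{name}" for name in field_names}


# Hypothetical NXdata group with one signal and two axes.
fields = ["data", "energy", "angle"]
explicit = expand_group_depends(fields, "/entry/instrument/detector/")
for key, path in explicit.items():
    print(key, "->", path)
```

The per-field form is still needed whenever the axes come from different groups, which a single group-level entry cannot express.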
However, I was also thinking about the name _depends in general. To me it reads as if the axis values themselves depend on another field. This would mean that the values have to be somehow stacked on top of the other field. What we actually do is reference/link something.
I agree and I think we cannot fully get rid of the AXISNAME_depends.
I think with Lukas' suggestion to just make these attributes of the AXISNAME and DATA fields, we can get rid of them. I would suggest not re-using @depends_on, because as far as I know it is reserved for NXtransformations. We could call it e.g. @reference or @source.
Yes, I just generally meant that we cannot remove the depends associated with AXISNAME and DATA (that is, as _depends or as an attribute) and only use the top-level depends.
I like @reference. So the proposal based on Lukas' suggestion would be:
NXdata:
  \@reference(NX_CHAR):
    doc: |
      Points to the path of a field defining the data to which this NXdata group refers.
  [...]
  AXISNAME(NX_NUMBER):
    \@reference(NX_CHAR):
      doc: |
        Points to the path of a field defining the axis to which this ``AXISNAME`` axis refers.
  DATA(NX_NUMBER):
    \@reference(NX_CHAR):
      doc: |
        Points to the path of a field defining the data to which the ``DATA`` field refers.
We could also propose @reference as a general concept for when an array is taken from or linked to somewhere else, as we do this quite a few times in the appdef. That way users always know where the data comes from, as HDF5 linking is not really visible when viewing a file.
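To illustrate how a reader of a file could follow such a @reference attribute back to its source, here is a minimal sketch in Python. It deliberately uses a nested dict as a toy stand-in for an HDF5 file (groups are dicts, fields are dicts with "value" and "attrs"), so it makes no claim about any HDF5 library. The helper `resolve` and all paths and names are hypothetical.

```python
def resolve(tree, path):
    """Walk a nested dict along an absolute, slash-separated path."""
    node = tree
    for part in path.strip("/").split("/"):
        node = node[part]
    return node


# Toy stand-in for an HDF5 file. All names and paths are hypothetical.
nexus_file = {
    "entry": {
        "instrument": {
            "detector": {
                "raw": {"value": [3, 5, 8], "attrs": {}},
            },
        },
        "data": {
            "data": {
                "value": [3, 5, 8],
                # The attribute records where the values were taken from.
                "attrs": {"reference": "/entry/instrument/detector/raw"},
            },
        },
    },
}

field = resolve(nexus_file, "/entry/data/data")
source = resolve(nexus_file, field["attrs"]["reference"])
print("referenced path:", field["attrs"]["reference"])
print("values match:", source["value"] == field["value"])
```

The point of the sketch is only that the origin of the data stays discoverable from the field itself, independent of how the link is realized in the file.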
I agree that it makes sense to have an explicit way to define that a concept A has a connection to another concept B, in addition to the explicit parent-child relationships (that come with the concept tree that a base class and appdef define).
My argument against using \@depends_on (although conceptually it makes sense) was always that the symbol "depends_on" should be reserved for cases of NXtransformations. In this spirit I support "reference". I can also see an argument for "refers_to". Or, and this I would find better: why use only one symbol at all? The point is that the more of these explicit decorations we add (which I am not against doing), the closer we come to an implementation of how one would make statements in e.g. RDF. Then why not also use e.g. "has_a", "is_a", "is_equivalent_to" as \@has_a symbols in NeXus? This could then also be used in the NeXusOntology. The only reason I am hesitant to go down this route is that we could alternatively formulate these references and additional connections between concepts directly using accepted semantic web technology, and would thus be watering down our message.
One thing I am not sure I understand: why is @depends_on reserved for only one base class? The point I meant was why not to use it generally when the meaning is quite clear?
As long as we express that the current element needs the definition of a previous one, I think it is a clear definition.
I am also fine with using another synonym, but the point I am pressing here is: pick one, and make it usable in all cases. Perhaps implying a class binding as well, that is, an NXdata refers back to data only (numeric array), an NXtransformations to another NXtransformations, etc.
I find defining many synonyms with all special cases confusing and difficult to keep up with (I know, I am not clever enough for that...)
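The class-binding idea mentioned above could look roughly like the following Python sketch. It assumes a very simple rule, namely that an element may only refer to an element of the same kind. The `kinds` map, the kind labels, the paths, and the function `reference_allowed` are all hypothetical and only serve to show the shape of such a check.

```python
# Hypothetical map from a path to the kind of element found there.
kinds = {
    "/entry/data/data": "numeric_array",
    "/entry/instrument/detector/raw": "numeric_array",
    "/entry/sample/transformations/rot": "transformation",
}


def reference_allowed(kinds, source, target):
    """Assumed binding rule: source and target must be of the same kind."""
    if source not in kinds or target not in kinds:
        return False
    return kinds[source] == kinds[target]


same_kind = reference_allowed(
    kinds, "/entry/data/data", "/entry/instrument/detector/raw"
)
mixed_kind = reference_allowed(
    kinds, "/entry/data/data", "/entry/sample/transformations/rot"
)
print("array -> array allowed:", same_kind)
print("array -> transformation allowed:", mixed_kind)
```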
The point is that the more of these explicit decorations we add (which I am not against doing), the closer we come to an implementation of how one would make statements in e.g. RDF. Then why not also use e.g. "has_a", "is_a", "is_equivalent_to" as \@has_a symbols in NeXus? This could then also be used in the NeXusOntology.
I agree with you that these terms are very useful for our description work.
However, in this case I would say that the concept of @reference is slightly different. It basically refers to a dataset from somewhere else when I create an entity of data. Something like same_as feels similar but lies more on the conceptual level for me, e.g., I want to say that NXsample/temperature is the same as NXmanipulator/sample_holder/temperature and we just create synonyms for the same data. This always fixes everything on the appdef level. In contrast, the reference is a dynamic link, which can be especially useful in the NXdata case, where we leave it up to the users to construct the data themselves.
One thing I am not sure I understand: why is @depends_on reserved for only one base class? The point I meant was why not to use it generally when the meaning is quite clear?
I agree with you and I think that @depends_on is also useful in other contexts. However, here I think the term "depends" (in whatever form) shouldn't be used in the first place, as these values don't depend on, i.e., stack with, the previous one. Rather, they reference the value from somewhere else.
LGTM. We should, however, also change the respective appdefs that use these concepts (NXmpes, NXmpes_arpes, NXxps (?)).
I rebased this branch and then used the @reference attribute in NXmpes and NXmpes_arpes. For NXxps, I would handle it in #249.
IMO, we can merge this PR now.
This enables the option to describe where the signal data of an NXdata comes from. It's a similar concept to AXISNAME_depends. It's just a proposal open for discussion and I'm not entirely sure if this is the right way to do it. So I welcome your feedback @lukaspie and @rettigl. General feedback from @FAIRmat-NFDI/areab is also welcome.
Different options could be:
Additionally or instead provide a depends field for the NXdata group to denote that it was linked to somewhere else. This is essentially a shortcut for writing out AXISNAME_depends + DATA_depends if they belong to the same target group.
This is coming from a recent discussion with Anders Hahlin and Anders Frisk. They are trying to combine multiple detector readings into the top-level data (e.g., for spin-resolved measurements).