FAIRmat-NFDI / pynxtools

https://fairmat-nfdi.github.io/pynxtools/
Apache License 2.0

Add NX docstring as attribute #415

Open lukaspie opened 3 months ago

lukaspie commented 3 months ago

@rettigl this is a way to add the NX docstrings to the HDF5 files as attributes. The idea is that you can pass the write-docs flag to the dataconverter and then docstrings are added. By default, this is turned off so as to not change any existing workflows. We can discuss if we want this to happen by default.

Here's an example using the xps reader: output.nxs.zip

The implementation is relatively trivial. There are, however, two open questions for me:

sherjeelshabih commented 3 months ago
  • Docs for NeXus attributes: NeXus attributes are already written as HDF5 attributes. Since HDF5 attributes cannot have attributes themselves, the question is where to place the docs for these attributes. My solution here: write another attribute <attribute>__docs (e.g. entry/definition/version__docs) to the HDF5 file.

Any practical solution works in my opinion. The only issue will be how to report/add this in the NXDL structure. It's a bit meta over the already meta attributes we have in the NXDL. Will something like adding this renameable field, FIELDNAME__docs, in NXobject work out nicely in the NXDL framework?

lukaspie commented 3 months ago

I didn't go this far. For groups and fields, I basically added an attribute @docs, whereas for attributes, I used an additional attribute <attribute>__docs. I guess the problem would be that all of these are undocumented.
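
The naming scheme described here can be sketched roughly as follows. This is only an illustration, not the code in this PR: the helper names `docs_location` and `plan_docs`, the `/@<attribute>` path notation, and the example docstrings are all assumptions made for the sketch.

```python
from typing import Dict, Tuple

DOCS_ATTR = "docs"      # attribute holding the docs of a group or field
DOCS_SUFFIX = "__docs"  # suffix appended to the name of a documented attribute


def docs_location(nexus_path: str) -> Tuple[str, str]:
    """Return (HDF5 object path, attribute name) that would hold a docstring.

    NeXus attributes are addressed as "<object path>/@<attribute>" here;
    this notation is an assumption of the sketch.
    """
    parent, _, leaf = nexus_path.rpartition("/")
    if leaf.startswith("@"):
        # HDF5 attributes cannot carry attributes themselves, so the
        # docstring becomes a sibling attribute on the same object.
        return parent or "/", leaf[1:] + DOCS_SUFFIX
    # Groups and fields get the docstring attached to themselves.
    return nexus_path, DOCS_ATTR


def plan_docs(docs: Dict[str, str]) -> Dict[Tuple[str, str], str]:
    """Map every documented NeXus path to the place its docs would be stored."""
    return {docs_location(path): text for path, text in docs.items()}


# Hypothetical docstrings, for illustration only.
plan = plan_docs({
    "/entry/title": "Extended title for the entry.",
    "/entry/definition/@version": "Version of the application definition.",
})
for (obj_path, attr_name), text in plan.items():
    print(f"{obj_path} @{attr_name}: {text}")
```

With h5py, each planned entry could then be written as `h5file[obj_path].attrs[attr_name] = text`, assuming the target group or field already exists in the file.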

sherjeelshabih commented 3 months ago

Ah alright. So it's just for the attributes that you add a suffix.

You're right. They will remain undocumented. We can leave them undocumented for now and see how it goes in practice.

This will make NeXus files easier to understand in practice. It also makes them more self-sufficient. And it seems this is the best we can do without overcomplicating it.

lukaspie commented 3 months ago

Notes from TF meeting: