nexpy / nxvalidate

NeXus validation tools and libraries
Other
0 stars 3 forks source link

nxopen() failure with __enter__ #21

Open j-woz opened 1 month ago

j-woz commented 1 month ago

I probably have this misconfigured somehow- nxopen() is returning an object that cannot be used by the Python context manager:

Example here: https://github.com/j-woz/nxvalidate/actions/runs/11167580273/job/31044090443

  File "/home/runner/.local/lib/python3.10/site-packages/nxvalidate/validate.py", line 682, in validate
    with nxopen(self.filename) as root:
AttributeError: __enter__
Error: Process completed with exit code 1.
j-woz commented 1 month ago

Is this object supposed to be an NXentry or NXfile?

j-woz commented 1 month ago

Ah- maybe this is supposed to be an NXroot. So is this a case where my file is malformed? It is generated with: https://github.com/j-woz/nxvalidate/blob/main/tests/0001-minimal.py

rayosborn commented 1 month ago

The context manager returns an NXroot object.

with nxopen(filename) as root:
    root['entry'] = NXentry()
...

It's not normally necessary to ever invoke a NXfile object.

j-woz commented 1 month ago

Is it possible a broken file format is causing this object type to be wrong? I get:

+        t = nxopen(self.filename)
+        print(str(t))
+        print(type(t))
         with nxopen(self.filename) as root:

=>

root
<class 'nexusformat.nexus.tree.NXentry'>
Traceback (most recent call last):
prjemian commented 1 month ago

Are you actually opening the file twice? Could that be the problem?

j-woz commented 1 month ago

With that debugging output, yes, it is being opened twice, but I was getting this issue before that.

rayosborn commented 1 month ago

What file are you opening? Here is what I get with one of my files:

>>> from nexusformat.nexus import *
>>> t=nxopen('chopper.nxs')
>>> print(str(t))
root
>>> print(type(t))
<class 'nexusformat.nexus.tree.NXroot'>
j-woz commented 1 month ago

It is created with this script from the new test suite: https://github.com/j-woz/nxvalidate/blob/main/tests/0001-minimal.py

j-woz commented 1 month ago

chopper.nxs works for me: https://github.com/j-woz/nxvalidate/actions/runs/11170205047/job/31052547457

rayosborn commented 1 month ago

I haven't exactly worked out what is going on with your example, but it's best not to use the NXFile functions directly. There are two ways of saving a file without invoking the NXFile.writefile function directly:

x = y = np.linspace(0, 2*np.pi, 101)
X,Y = np.meshgrid(y, x)
z = np.sin(X) * np.sin(Y)

root = NXroot(NXentry())
root['entry/data'] = NXdata(z,[x,y])

root.save(file_name)

or

x = y = np.linspace(0, 2*np.pi, 101)
X,Y = np.meshgrid(y, x)
z = np.sin(X) * np.sin(Y)

with nxopen(file_name, "w") as root:
    root['entry'] = NXentry()
    root['entry/data'] = NXdata(z,[x,y])

I think you were working from the doctoring examples, which should be updated to reflect best practices.

j-woz commented 1 month ago

Yes, it is based on the online docs. That produces a structure like:

HDF5 "/tmp/test-data/0001-minimal.nxs" {
GROUP "/" {
   ATTRIBUTE "HDF5_Version" {
   ATTRIBUTE "NX_class" {
   ATTRIBUTE "file_name" {
   ATTRIBUTE "file_time" {
   ATTRIBUTE "h5py_version" {
   ATTRIBUTE "nexusformat_version" {
   GROUP "data" {
      ATTRIBUTE "NX_class" {
      ATTRIBUTE "axes" {
      ATTRIBUTE "signal" {
   GROUP "entry" {

A good structure looks like:

HDF5 "/tmp/test-data/0003-nxopen-w.nxs" {
GROUP "/" {
   GROUP "entry" {
      ATTRIBUTE "NX_class" {
      GROUP "data" {
         ATTRIBUTE "NX_class" {
         ATTRIBUTE "axes" {
         ATTRIBUTE "signal" {

So, I suggest nxinspect checks for the GROUP "entry" using a lower-level API before attempting the nxopen().

j-woz commented 1 month ago

Ok- the root issue is that test 0001 has top-level attribute NX_class=NXentry , where test 0003 has no top-level attribute, so the nxclass defaults to 'NXroot' in tree.py:812 . This value is used to reassign the object class by assigning to __class__ . NXroot supports context manager, NXentry does not. I can test for this...

j-woz commented 1 month ago

Here is the proposed fix: https://github.com/nexpy/nxvalidate/commit/c5dca6487edac2318da94d4d604874480e1b634e

It works with the test suite here: https://github.com/j-woz/nxvalidate/actions/runs/11187264444/job/31103970077#step:8:29

prjemian commented 1 month ago

In this code:

            if ('NX_class' in fp.attrs
                    and fp.attrs['NX_class'] == 'NXentry'):
                raise NeXusError(
                    'top-level class is NXentry: '
                    'must be NXroot or unset')

The only allowed value is "NXroot". The test here is not stringent. I suggest this re-write:

            attr = fp.attrs.get("NX_class", "NXroot")
            if attr != "NXroot":
                raise NeXusError(f'top-level class is {attr}: must be NXroot or unset')
rayosborn commented 1 month ago

Whether or not the NX_class attribute has been set is handled by the nexusformat API, so if you read the file using nxopen, groups will be automatically assigned to classes according to the attribute value. The NX_class attribute is then removed from the list of group attributes. You would only know it exists if you are using h5py directly, which I don't recommend, so you should never have to worry about it. If the NX_class attribute is missing, then the group class is NXgroup, which is not an approved class so should generate an exception. I"m not sure that NXvalidate handles the nature of the exception properly yet, but that's another issue.

rayosborn commented 1 month ago

I just noticed that the NXroot NXDL file is the only one to list NX_class explicitly as an attribute. This is, I believe, a mistake. The NXDL class is defined by the type attribute attached to the group tag, and the details of its implementation in the HDF5 file should be hidden from the user. I suspect it was added because NXroot is not a normal class, since it is technically not a group at all - it just represents the file level. In the nexusformat API, this is assigned to be a group of NXroot class even if the NX_class attribute is not defined.

I don't know how the C-API handled this; perhaps it expected to find a NX_class attribute as a file-level attribute. Personally, I don't think it should be specified in the NXDL file at all. I will raise this with the NIAC.