prjemian / punx

Python Utilities for NeXus HDF5 files
https://prjemian.github.io/punx

test application definition specifications #108

Open prjemian opened 6 years ago

prjemian commented 6 years ago

suitable example from the canSAS NXcanSAS examples:

```
/home/mintadmin/Documents/eclipse/NXcanSAS_examples/others/Mantid
mintadmin@mint-vm ~/.../others/Mantid $ ll 33837rear_1D_1.75_16.5_NXcanSAS_v3.h5
-rw-r--r-- 1 mintadmin 36440 Sep 15  2017 33837rear_1D_1.75_16.5_NXcanSAS_v3.h5
```

prjemian commented 6 years ago

@butlerpd, @krzywon: This is a good file to test since it does not adhere to the NXcanSAS standard (it uses an `uncertainty` attribute instead of `uncertainties`, and possibly deviates in other ways).

prjemian commented 6 years ago

@jilavsky: There is a related Mantid issue to revise the code that supports NXcanSAS.

prjemian commented 2 years ago

Agreed. This needs a strategy for testing a data file that is written to an application definition. Data files that use the NXcanSAS application definition are more difficult to test, since NXcanSAS specifies optional content within subgroups several levels deep.

Pick a simpler application definition and build an example file as a first test. The example files above would be additional tests.

prjemian commented 2 years ago

proposed algorithm

Evaluating a data file that is written to one (or more) application definitions needs some special consideration. The application definition modifies the various base classes that are used to describe fields, groups, attributes, and links.

We know:

So, when validating a data file, check that the NX_class attribute of any group is not an application definition. Look for the definition field in any NXentry or NXsubentry group. Take that application definition and apply it (modify the applicable base classes according to the application definition structure) while validating child content from that point. If an application definition was already in effect at that point, report this as an error.
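
The traversal described above could be sketched roughly as follows. This is only an illustration of the control flow, not punx code: the `Group` class is a hypothetical in-memory stand-in for an HDF5 group, `APPLICATION_DEFINITIONS` is a short hand-written set (a real validator would take the names from the NXDL files), and the step that actually modifies the base classes is reduced to a note in the findings.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical, abbreviated set of names; a real validator would read
# the application definition names from the installed NXDL files.
APPLICATION_DEFINITIONS = {"NXcanSAS", "NXiqproc", "NXmonopd", "NXmx"}


@dataclass
class Group:
    """Minimal stand-in for an HDF5 group (illustration only)."""

    nx_class: str
    definition: Optional[str] = None  # content of the "definition" field
    children: Dict[str, "Group"] = field(default_factory=dict)


def check_group(path: str, group: Group, active: Optional[str] = None,
                findings: Optional[List[str]] = None) -> List[str]:
    """Walk the tree, tracking which application definition is in effect."""
    findings = [] if findings is None else findings

    # A group must name a base class, never an application definition.
    if group.nx_class in APPLICATION_DEFINITIONS:
        findings.append(
            f"ERROR {path}: NX_class={group.nx_class} is an application definition"
        )

    # Only NXentry and NXsubentry groups may declare a definition.
    if group.nx_class in ("NXentry", "NXsubentry") and group.definition:
        if active is not None:
            findings.append(
                f"ERROR {path}: definition {group.definition} found "
                f"but {active} is already in effect"
            )
        else:
            active = group.definition
            findings.append(f"NOTE {path}: apply {active} to child content")

    # Children are validated with the definition that is now in effect.
    for name, child in sorted(group.children.items()):
        check_group(f"{path.rstrip('/')}/{name}", child, active, findings)
    return findings
```

With h5py, the same checks would read `group.attrs["NX_class"]` and the `definition` dataset instead of the stand-in attributes. Passing `active` down as an argument (rather than storing it globally) means the definition stops applying automatically once the walk leaves that entry.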

prjemian commented 2 years ago

Might be best to start with a known good file (hint: create one according to a simpler app_def such as NXiqproc).

With a simple test program (below), discover which of the NeXus HDF5 files in the example data directory use an application definition:

| file | group | definition |
|---|---|---|
| 02_03_setup.h5 | /scan_1 | |
| 1998spheres.h5 | /sasentry_0 | NXcanSAS |
| 1998spheres.h5 | /sasentry_1 | NXcanSAS |
| 33837rear_1D_1.75_16.5_NXcanSAS_v3.h5 | /sasentry01 | NXcanSAS |
| 33id_spec_22_2D.hdf5 | /S22 | NXspecdata |
| DLS_i03_i04_NXmx_Therm_6_2.nxs | /entry | NXmx |
| Data_Q.h5 | /sasentry01 | |
| Qx_rank4_test_data.h5 | /sasentry | NXcanSAS |
| USAXS_flyScan_GC_M4_NewD_15.h5 | /entry | |
| chopper.nxs | /entry | |
| compression.h5 | /SASentry | |
| cs_af1410.h5 | /AF1410_10 | NXcanSAS |
| cs_af1410.h5 | /AF1410_1h | NXcanSAS |
| cs_af1410.h5 | /AF1410_20 | NXcanSAS |
| cs_af1410.h5 | /AF1410_2h | NXcanSAS |
| cs_af1410.h5 | /AF1410_50 | NXcanSAS |
| cs_af1410.h5 | /AF1410_5h | NXcanSAS |
| cs_af1410.h5 | /AF1410_8h | NXcanSAS |
| cs_af1410.h5 | /AF1410_cc | NXcanSAS |
| cs_af1410.h5 | /AF1410_hf | NXcanSAS |
| cs_af1410.h5 | /AF1410_qu | NXcanSAS |
| example_01_1D_I_Q.h5 | /sasentry | NXcanSAS |
| example_mapping.nxs | /entry1 | |
| example_mapping.nxs | /entry_micro | |
| gov_5.h5 | /gov_5 | |
| prj_test.nexus.hdf5 | /entry | |
| scan101.nxs | /com_05551 | NXentry |
| verysimple.nx5 | /entry | |
| writer_1_3.hdf5 | /Scan | |
| writer_2_1.hdf5 | /entry | |

python code

```python
import h5py
import pathlib
import pyRestTable
from punx import utils


def find_definitions():
    path = pathlib.Path(__file__).parent / "punx" / "data"
    # print(f"{path}: {path.exists() = }")

    table = pyRestTable.Table()
    table.labels = "file group definition".split()

    def get_definition(group):
        if "definition" in group:
            ds = group["definition"]
            definition = utils.decode_byte_string(ds[()])
            if isinstance(definition, list):
                definition = definition[0]
            return definition

    for test_file in sorted(path.iterdir()):

        def report_definitions(parent, nx_class):
            for item in sorted(parent):
                if utils.isNeXusGroup(parent[item], nx_class):
                    group = parent[item]
                    definition = get_definition(group)
                    table.addRow((test_file.name, group.name, definition or ""))
                    if nx_class == "NXentry":
                        report_definitions(group, "NXsubentry")

        if test_file.is_file() and utils.isNeXusFile(test_file):
            # print(f"{type(test_file)} {test_file.name}")
            with h5py.File(test_file, "r") as root:
                report_definitions(root, "NXentry")

    print(table.reST(fmt="md"))


if __name__ == "__main__":
    find_definitions()
```

prjemian commented 2 years ago

Possibly start by creating a contrived (random data, not actual) NXmonopd test data file and build tests for things that pass and do not pass.
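
A contrived file of that kind might be written with something like the sketch below. The function name, the `include_monitor` switch, and all values are made up for illustration. The group and field names follow my reading of the NXmonopd definition and should be checked against the published NXDL before any tests rely on them.

```python
import h5py
import numpy as np


def _group(parent, name, nx_class):
    """Create a subgroup and tag it with its NeXus base class."""
    group = parent.create_group(name)
    group.attrs["NX_class"] = nx_class
    return group


def write_contrived_nxmonopd(filename, n_points=50, include_monitor=True, seed=0):
    """Write a small NXmonopd-style file filled with random (not measured) data."""
    rng = np.random.default_rng(seed)
    polar_angle = np.linspace(10.0, 120.0, n_points)
    counts = rng.integers(0, 1000, size=n_points)  # random, not real data

    with h5py.File(filename, "w") as root:
        entry = _group(root, "entry", "NXentry")
        entry.create_dataset("title", data="contrived NXmonopd test file")
        entry.create_dataset("start_time", data="2020-01-01T00:00:00")
        entry.create_dataset("definition", data="NXmonopd")

        instrument = _group(entry, "instrument", "NXinstrument")
        source = _group(instrument, "source", "NXsource")
        source.create_dataset("type", data="Reactor Neutron Source")
        source.create_dataset("name", data="contrived source")
        source.create_dataset("probe", data="neutron")

        crystal = _group(instrument, "crystal", "NXcrystal")
        wavelength = crystal.create_dataset("wavelength", data=[1.54])
        wavelength.attrs["units"] = "angstrom"

        detector = _group(instrument, "detector", "NXdetector")
        tth = detector.create_dataset("polar_angle", data=polar_angle)
        tth.attrs["units"] = "degree"
        detector.create_dataset("data", data=counts)

        sample = _group(entry, "sample", "NXsample")
        sample.create_dataset("name", data="contrived sample")
        rotation = sample.create_dataset("rotation_angle", data=0.0)
        rotation.attrs["units"] = "degree"

        # Leaving this group out gives a file that is expected NOT to pass
        # (assumption: NXmonopd requires an NXmonitor group).
        if include_monitor:
            monitor = _group(entry, "monitor", "NXmonitor")
            monitor.create_dataset("mode", data="monitor")
            monitor.create_dataset("preset", data=1000.0)
            monitor.create_dataset("integral", data=1000.0)

        # NXdata points at the detector arrays through HDF5 hard links.
        data = _group(entry, "data", "NXdata")
        data["polar_angle"] = detector["polar_angle"]
        data["data"] = detector["data"]
```

Calling `write_contrived_nxmonopd("good.h5")` and `write_contrived_nxmonopd("bad.h5", include_monitor=False)` would give one file intended to pass and one intended to fail, which is the pair of cases the tests need.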