Open biochem-fan opened 10 months ago
It comes down to the meaning of generally acceptable. While this Dectris-specific content might be known to custom data file readers, it is not guaranteed that a general reader might understand what to do with the content. A validation finding marked as a warning is assigned a lower numerical value. Only findings marked as errors result in a strong negative value. The overall finding value for the data file gives a very coarse opinion of how general this data file is.
In the Naming Conventions section of the NeXus User Guide, there is a table of Reserved prefixes. Are you using the DECTRIS_ prefix as described there?
As noted in #238, the reserved prefixes feature is evaluated based on the version of the NeXus definitions in use. If I recall correctly, the reserved prefixes were adopted after the v2018.5 release. You can add the latest definitions version (v2022.07) with the punx install command:
punx install v2022.07
By default, punx will validate using the latest version (by release date) installed locally. If you are using the DECTRIS_ prefix, validation with a more recent version may change the findings of this Dectris-specific content.
To find what definitions versions you have available, run: punx config
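To make the reserved-prefix question concrete, here is a minimal Python sketch that checks whether an item name starts with a reserved prefix. The prefix list is written from memory of the table in the NeXus User Guide and is an assumption (only DECTRIS_ is named in this thread), and the helper `reserved_prefix` is hypothetical, not part of punx.

```python
# Reserved prefixes as recalled from the NeXus User Guide table.
# ASSUMPTION: verify this list against the manual for the definitions
# version you validate with; only DECTRIS_ is mentioned in this thread.
RESERVED_PREFIXES = (
    "BLUESKY_", "DECTRIS_", "IDF_", "NDAttr", "NX_", "PDBX_", "SAS_", "SILX_",
)


def reserved_prefix(name):
    """Return the reserved prefix that `name` starts with, or None."""
    for prefix in RESERVED_PREFIXES:
        if name.startswith(prefix):
            return prefix
    return None


# The first two names appear in the validation report in this thread;
# the third is a hypothetical prefixed name added for contrast.
for name in ("detectorSpecific", "auto_summation", "DECTRIS_auto_summation"):
    print(name, "->", reserved_prefix(name))
```

Names such as detectorSpecific carry no reserved prefix, so a prefix-based rule would not apply to them regardless of the definitions version.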
Are you using the DECTRIS_ prefix as described there?
No.
You can add the latest definitions version (v2022.07) with the punx install command:
Yes, I am using the latest definitions:
# main user 2023-06-26 08:57:16 d669ffb /home1/XXXX/.config/punx/main
It comes down to the meaning of generally acceptable. While this Dectris-specific content might be known to custom data file readers, it is not guaranteed that a general reader might understand what to do with the content.
My question is, is it acceptable to have items not defined in NeXus? Of course I understand that a general reader might not understand what to do with the content. I don't expect or require general readers to use those additional items. They are stored as supplemental information, just for the record.
If the NeXus standard requires that a NeXus file must not contain additional non-NeXus data, I will make two files, one with NeXus fields only and the other for additional data. But is such separation really necessary? I am new to NeXus and haven't gone through all the formal specifications, so any suggestions would be welcome.
My question is, is it acceptable to have items not defined in NeXus?
Yes, it is acceptable. Too bad this is not in the list of frequently asked questions. I'll fix that soon. The easiest reference to find in the NeXus User Guide is in the NeXus Class Definitions section. See these paragraphs under Base classes:
Base class definitions are permissive rather than restrictive. While the terms defined aim to cover most possible use cases, and to codify the spelling and meaning of such terms, the class specifications cannot list all acceptable groups and fields. To be able to progress the NeXus standard, additional data (groups, fields, attributes) are acceptable in NeXus HDF5 data files.
Users are encouraged to find the best defined location in which to place their information. It is understood there is not a predefined place for all possible data.
Validation procedures should treat such additional items (not covered by a base class specification) as notes or warnings rather than errors.
The punx code reports these findings as warnings.
Would you agree that the wording here seems a bit strong and could be relaxed a bit? Instead of "not generally acceptable", it could report "not generally recognized".
Actually, the finding of NOTE or WARN may be very specific to the details of the finding. The first table here lists the different possible findings. Softening the wording of the WARN finding may not be the best solution for this case.
Can you report here some of the WARN findings? Here's my example using the same version of the NeXus definitions and the S2p5min_00070_00001.h5 example file:
(bluesky_2023_3) prjemian@arf:~/.../BCDA-APS/gemviz$ punx config
!!! WARNING: this program is not ready for distribution.
Locally-available versions of NeXus definitions (NXDL files)
============= ====== =================== ======= ==================================================================
NXDL file set cache  date & time         commit  path
============= ====== =================== ======= ==================================================================
a4fd52d       source 2016-11-19 01:07:45 a4fd52d /home/prjemian/Documents/projects/prjemian/punx/punx/cache/a4fd52d
v3.3          source 2017-07-12 10:41:12 9285af9 /home/prjemian/Documents/projects/prjemian/punx/punx/cache/v3.3
Schema-3.4    user   2018-05-15 08:24:34 aa1ccd1 /home/prjemian/.config/punx/Schema-3.4
v2018.5       source 2018-05-15 16:34:19 a3045fd /home/prjemian/Documents/projects/prjemian/punx/punx/cache/v2018.5
v2020.1       user   2020-01-31 04:17:34 5c4cfec /home/prjemian/.config/punx/v2020.1
v2022.07      user   2022-08-02 06:43:48 e5e2347 /home/prjemian/.config/punx/v2022.07
main          user   2023-06-26 08:57:16 d669ffb /home/prjemian/.config/punx/main
============= ====== =================== ======= ==================================================================
default NXDL file set: main
$ punx validate --report WARN ../tiled-template/dev_sampler/nexus_punx/S2p5min_00070_00001.h5
!!! WARNING: this program is not ready for distribution.
data file: ../tiled-template/dev_sampler/nexus_punx/S2p5min_00070_00001.h5
NeXus definitions: main, dated 2023-06-26 08:57:16, sha=d669ffb453ed5a89ca746f8d440adc1b9a5ecc05
findings
======================================== ====== ============= =======================================
address status test comments
======================================== ====== ============= =======================================
/entry/Metadata WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/AbsIntCoeff WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/AbsInt_Standard WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Beam_x_pixel WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Beam_y_pixel WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/CRL_A4 WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/CenterBS_gain WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/CenterBS_gainUnit WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/CenterBS_phd WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/DetZmotor WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Detector_tilt_y WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/ESAFNumber WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Energy WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/EnergyThres1 WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/ExposureTime WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/GISAXS_gain WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/GISAXS_gainUnit WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/GISAXS_phd WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/GIWAXS_gain WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/GIWAXS_gainUnit WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/GIWAXS_phd WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/GUPNumber WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Heater_inUse WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/IC1_phd WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/IC2_gain WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/IC2_gainUnit WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/It_inUse WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/It_phd WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Lakeshore_Control_Temp WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Lakeshore_Loop1_SetPoint WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Lakeshore_Loop2_SetPoint WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Lakeshore_Sample_Temp WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/NDArrayEpicsTSSec WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/NDArrayEpicsTSnSec WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/NDArrayTimeStamp WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/NDArrayUniqueId WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Q_Standard WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/SAXS_gain WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/SAXS_gainUnit WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/SAXS_phd WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/SDD WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/SRcurrent WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Sample_DataName WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Sample_Description WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Sample_Name WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Sample_Time WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/UserName WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/Wavelength WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/hexH WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/hexV WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/linkam_ci94_errors WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/linkam_ci94_limit WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/linkam_ci94_rate WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/linkam_ci94_status WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/linkam_ci94_temp WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/linkam_ci94_temp2 WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/monoE WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/pinhH WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/pinhV WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/pixel_size WARN validItemName NXcollection contains non-NeXus content
/entry/Metadata/timestamp WARN validItemName NXcollection contains non-NeXus content
======================================== ====== ============= =======================================
summary statistics
======== ===== =========================================================== =========
status   count description                                                 (value)
======== ===== =========================================================== =========
OK       620   meets NeXus specification                                   100
NOTE     7     does not meet NeXus specification, but acceptable           75
WARN     61    does not meet NeXus specification, not generally acceptable 25
ERROR    0     violates NeXus specification                                -10000000
TODO     382   validation not implemented yet                              0
UNUSED   0     optional NeXus item not used in data file                   0
COMMENT  0     comment from the punx source code                           0
OPTIONAL 220   allowed by NeXus specification, not identified              99
         --
TOTAL    1290
======== ===== =========================================================== =========
<finding>=94.526432 of 908 items reviewed
NeXus definitions version: main
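The overall finding value appears to be a count-weighted average of the per-status values, with TODO items left out of the item count. The Python sketch below is inferred from the numbers in the report, not taken from the punx source, so the function name `overall_finding` and the rule of skipping TODO are assumptions. With the rows of the summary table it reproduces the reported line.

```python
# (status, count, value) rows copied from the summary statistics table.
summary = [
    ("OK", 620, 100),
    ("NOTE", 7, 75),
    ("WARN", 61, 25),
    ("ERROR", 0, -10000000),
    ("TODO", 382, 0),
    ("UNUSED", 0, 0),
    ("COMMENT", 0, 0),
    ("OPTIONAL", 220, 99),
]


def overall_finding(rows, skipped=("TODO",)):
    """Count-weighted average of the status values.

    ASSUMPTION: statuses listed in `skipped` are not counted as reviewed
    items. This is inferred from "908 items reviewed" (1290 - 382).
    """
    reviewed = [(count, value) for status, count, value in rows
                if status not in skipped]
    n_items = sum(count for count, _ in reviewed)
    weighted_sum = sum(count * value for count, value in reviewed)
    return weighted_sum / n_items, n_items


score, n_items = overall_finding(summary)
print(f"<finding>={score:.6f} of {n_items} items reviewed")
# prints: <finding>=94.526432 of 908 items reviewed
```

Because an ERROR carries a value of -10000000, even one or two errors dominate the average, while each WARN only pulls it slightly below 100. The same arithmetic applied to the 377.nxs summary later in this thread (which has 2 ERROR findings) gives -28847.380608.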
Here's the assignment of the finding: https://github.com/prjemian/punx/blob/327192fb5ea0edf69699881c531ad4bc5c12b8d9/punx/validations/item_name.py#L172-L174
That code is called from the validator() method, which reports this table in its documentation: https://github.com/prjemian/punx/blob/327192fb5ea0edf69699881c531ad4bc5c12b8d9/punx/validations/item_name.py#L56-L63
I see an inconsistency here in the punx documentation and possibly in the assignment of the finding. Visually, these names appear to pass the regular expression pattern for NXDL group and field names.
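The visual check can be repeated in a few lines of Python. The pattern below is written from memory of the NeXus manual and should be treated as an assumption; the authoritative pattern is the one in the NXDL schema for the definitions version in use. The helper `passes` is hypothetical.

```python
import re

# ASSUMED pattern for NXDL group and field names (from memory of the
# NeXus manual); check the NXDL schema for the authoritative version.
VALID_ITEM_NAME = re.compile(r"^[a-zA-Z0-9_]([a-zA-Z0-9_.]*[a-zA-Z0-9_])?$")


def passes(name):
    """True if `name` matches the assumed NXDL item-name pattern."""
    return VALID_ITEM_NAME.match(name) is not None


# A few names copied from the WARN findings in this thread.
names = [
    "Metadata",
    "AbsIntCoeff",
    "CenterBS_gainUnit",
    "linkam_ci94_temp2",
    "detectorSpecific",
]
for name in names:
    print(f"{name}: {passes(name)}")
```

Under this assumed pattern every listed name matches, which is consistent with the observation that the WARN status is not caused by the spelling of the names.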
What's the output with your file?
punx --version
punx validate --report WARN path/to/your/NeXus/file.h5
Also, punx should not be reporting on content in NXcollection since it has these special rules:
ignoreExtraGroups="true"
ignoreExtraFields="true"
ignoreExtraAttributes="true"
That is the real problem here. Sorry I did not see that earlier.
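One way to picture the intended behavior is a tree walk that does not descend into NXcollection groups. The Python sketch below uses a hypothetical nested-dict stand-in for an HDF5 file; it is not punx's data model or its fix, and the function name `addresses_to_validate` is made up for illustration.

```python
# Hypothetical in-memory stand-in for an HDF5 tree: a group is a dict with
# an "NX_class" entry and a "children" dict; a field is represented by None.
tree = {
    "NX_class": "NXroot",
    "children": {
        "entry": {
            "NX_class": "NXentry",
            "children": {
                "title": None,
                "Metadata": {
                    "NX_class": "NXcollection",
                    "children": {"AbsIntCoeff": None, "Energy": None},
                },
            },
        },
    },
}


def addresses_to_validate(group, address=""):
    """Yield item addresses to check, skipping NXcollection contents."""
    for name, child in group["children"].items():
        child_address = f"{address}/{name}"
        yield child_address
        if child is None:
            continue  # a field has no children to visit
        if child["NX_class"] == "NXcollection":
            continue  # extra groups, fields, and attributes are ignored here
        yield from addresses_to_validate(child, child_address)


print(list(addresses_to_validate(tree)))
# prints: ['/entry', '/entry/title', '/entry/Metadata']
```

In this sketch the NXcollection group itself is still visited but nothing inside it is. Whether the group itself should produce a finding is a separate design choice.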
Yes, it is acceptable. Too bad this is not in the list of frequently asked questions. I'll fix that soon. The easiest reference to find in the NeXus User Guide is in the NeXus Class Definitions section.
I see. That is a relief.
Would you agree that the wording here seems a bit strong and could be relaxed a bit? Instead of "not generally acceptable", it could report "not generally recognized".
Yes, that would be clearer. Thank you very much.
punx --version
!!! WARNING: this program is not ready for distribution.
0.3.4
punx validate --report WARN path/to/your/NeXus/file.h5
!!! WARNING: this program is not ready for distribution.
data file: 377.nxs
NeXus definitions: main, dated 2023-06-26 08:57:16, sha=d669ffb453ed5a89ca746f8d440adc1b9a5ecc05
findings
============================================================================= ====== ============= =======================================
address status test comments
============================================================================= ====== ============= =======================================
/entry/instrument/detector/detectorSpecific WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/auto_summation WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/calibration_type WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/compression WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/countrate_correction_bunch_mode WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/countrate_correction_count_cutoff WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/data_collection_date WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/detector_readout_period WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/eiger_fw_version WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/element WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/flatfield WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/frame_count_time WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/frame_period WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/module_bandwidth WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/nframes_sum WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/nimages WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/nsequences WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/ntrigger WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/number_of_excluded_pixels WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/photon_energy WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/pixel_mask WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/roi_mode WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/software_version WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/summation_nimages WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/test_mode WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/trigger_mode WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/x_pixels_in_detector WARN validItemName NXcollection contains non-NeXus content
/entry/instrument/detector/detectorSpecific/y_pixels_in_detector WARN validItemName NXcollection contains non-NeXus content
============================================================================= ====== ============= =======================================
summary statistics
======== ===== =========================================================== =========
status   count description                                                 (value)
======== ===== =========================================================== =========
OK       441   meets NeXus specification                                   100
NOTE     5     does not meet NeXus specification, but acceptable           75
WARN     28    does not meet NeXus specification, not generally acceptable 25
ERROR    2     violates NeXus specification                                -10000000
TODO     67    validation not implemented yet                              0
UNUSED   0     optional NeXus item not used in data file                   0
COMMENT  0     comment from the punx source code                           0
OPTIONAL 215   allowed by NeXus specification, not identified              99
         --
TOTAL    758
======== ===== =========================================================== =========
<finding>=-28847.380608 of 691 items reviewed
NeXus definitions version: main
I generate my NeXus file by adding metadata to the *_master.h5 file written by a Dectris EIGER detector. Thus, the file contains many Dectris-specific items that are not defined in NeXus. punx flags them as "WARN: NXcollection contains non-NeXus content", and at the bottom of the output WARN is explained as "does not meet NeXus specification, not generally acceptable". I wonder if having non-NeXus content is really "not generally acceptable".
CC: @phyy-nx