FAIRmat-NFDI / pynxtools

https://fairmat-nfdi.github.io/pynxtools/
Apache License 2.0
13 stars 8 forks source link

[Bug]: Invalid groups with linked children don't get detected as undocumented #429

Open rettigl opened 2 months ago

rettigl commented 2 months ago

Contact Details

rettig@fhi-berlin.mpg.de

What happened?

If the converter gets a path which contains a link, but some undocumented elements in the path, it is not recognized as undocumented by the writer. Example from NXmpes

 "/ENTRY[entry]/SAMPLE[sample]": {
    "bias": {
      "voltmeter": "@link:/entry/instrument/manipulator/sample_bias_voltmeter"
    }
}

Does not produce a warning, even though "bias" is not defined, but rather "bias_env". "bias" also does not get an NX_class attribute.

Relevant log output

No response

rettigl commented 2 months ago

/entry/sample/bias gets namefitted to /ENTRY/SAMPLE/BEAM/TRANSFORMATIONS ...

This code is clearly not functioning completely.

lukaspie commented 2 months ago

Isn't this a consequence of how we defined our namefitting algorithm? If bias is not explicitly defined in NXsample, it tries to namefit and BEAM in NXsample is the closest fit because of the b at the beginning. Then inside NXbeam, there is only TRANSFORMATIONS as a renameable group, so that's what it's fitting voltmeter to. So I don't think it's a bug.

This is why I am not a big fan of our current namefitting algo, but one needs to do something for these renameable fields, so I currently don't see any alternative. For NXsample, it is particularly bad because of the many uppercase groups.

The problem with throwing a warning is: what if I actually wanted to name my BEAM bias? I know, not very realistic, but there are other cases where something similar might happen. In that case, do we want to have a warning whenever we namefit a fully uppercase name?

rettigl commented 2 months ago

I think the solution would be to not do automatic name fitting, but always require to provide the NXCLASS in such cases (or in case of fields also the FIELDNAME).

rettigl commented 2 months ago

Otherwise this produces garbage, as here, and does not detect errors.

lukaspie commented 1 month ago

Note that during NIAC2024, a new resolution was adopted for uppercase notation: https://github.com/nexusformat/definitions/issues/1436. After implementing this, we would probably get rid of these ambiguities.

See issue here: https://github.com/FAIRmat-NFDI/pynxtools/issues/441

lukaspie commented 1 month ago

I think the solution would be to not do automatic name fitting, but always require to provide the NXCLASS in such cases (or in case of fields also the FIELDNAME).

We discussed this today and will adopt this to throw an error. Will look into the implementation.