FAIRmat-NFDI / pynxtools

https://fairmat-nfdi.github.io/pynxtools/
Apache License 2.0
13 stars 8 forks source link

Use quantity name directly #437

Closed TLCFEM closed 1 month ago

lukaspie commented 1 month ago

Hi @TLCFEM, thanks for the PR, seems like a cleaner solution.

As you can see, the tests are failing in the generation of the nexus metainfo package, I think because some changes recently in the NOMAD metainfo that break how we build the package here. From what I could tell, most (if not all) changes in the metainfo were made by you, so could you please give us a bit of guidance how we can change our code to work with the new version? Thanks!

TLCFEM commented 1 month ago

Someone needs to address this issue. It's broken in the current state.

https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/merge_requests/1173#note_278826

lukaspie commented 1 month ago

Someone needs to address this issue. It's broken in the current state.

https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/merge_requests/1173#note_278826

Alright, thanks for the pointer. This is something that is anyway on our todo as the pattern matching rules were recently changed in NeXus. I'll bring it up in Area B.

TLCFEM commented 1 month ago

There are two options:

  1. keep existing API and change resolve_variadic_name to properly work with the rules.
  2. expose resolve_variadic_name as a callback such that totally offload this task to the external.

Either would be better than what we have now.

lukaspie commented 1 month ago

I think the init_nexus_metainfo is currently failing because we have now started using the NOMAD base sections. Specifically, each NeXus class (that is a subsection in the metainfo) is now inheriting from a specific base section or, by default, from the basic BaseSection. The idea here is to profit from the data model and the normalizers that are defined in the basesections module and map to the schemas that are defined in the other areas.

Now, the problem that we have is that the BaseSection class defines a Quantity name (see here), but in some of the NeXus definitions, we define a group called name (see here for an example) which gets converted to a SubSection. So we have mixed inheritance, i.e., we overwrite the Quantity name with the SubSection name in the derived class.

@TLCFEM based on https://gitlab.mpcdf.mpg.de/nomad-lab/nomad-FAIR/-/merge_requests/2151, I understand that this is not allowed anymore, correct? In that case, we probably have to change the inheritance from the base sections.

cc @sherjeelshabih

TLCFEM commented 1 month ago

We discussed this today and it does not sound like a good idea to do so. The original metainfo system was not designed to achieve this. It is possible to emit a warning instead though.

Depending on what you side wants to do.

TLCFEM commented 1 month ago

@markus1978 There is a need here.

markus1978 commented 1 month ago

@TLCFEM btw. this is different from the discussion with @aalbino2. Lets call these cases A and B ;-). But you are right, you cannot inherit two different things under the same name.

All solutions that I can think of for B are rather complicate, error prone, and undesirable. Can't we simply avoid this? The nexus Metainfo is generated, why not simply change the names here? The schema generation and parser could be aware of this problem, have a list of exceptional properties, and prefix those that lead to double inheritance?

Btw. why is something simple like name a group?

We can further discuss both cases A and B. I guess eventually, we will maybe allow to inherit two things under different names and then use normalize functions to make the syntactically different values semantically meaningful. But this will not be an easy and not a quick fix.

sherjeelshabih commented 1 month ago

@lukaspie as @markus1978 suggests, I'm happy with the approach to rename these groups when we generate metainfo. We do not use these AppDefs, right? If we do use it, we can also handle this renaming during populating from an HDF5 file.

This should be much simpler.

lukaspie commented 1 month ago

Btw. why is something simple like name a group?

This unfortunately comes from classes in the NeXus standard which we did not design, but still want to support for the NeXus community. So we generally need a workaround for such situations.

I agree with what @markus1978 said, it doesn't make sense to have both properties inherited and retained in their original form because we will also have problems with the normalization later on if these concepts semantically mean some different (as they do here). For now, we'll just rename the ones that cause a conflict (see #453) and add a suffix to them. We'll just have to update the exception list if there are changes to the base sections going forward.

Independent from this discussion, I suggest we merge this PR (we need this anyway) and then we'll add the renaming fix on top.