FAIRmat-NFDI / nexus_definitions

Definitions of the NeXus Standard File Structure and Contents
https://manual.nexusformat.org/

Sprint22 matwerk #248 #258

Closed mkuehbach closed 1 day ago

mkuehbach commented 1 day ago
1. There are a lot of required elements, are they always there? Especially those that are stated to be optional in the docs.

I have adjusted this slightly: if an analysis step has been executed, what was computed is documented.

2. Is it possible to merge the two application definitions described here? I.e., have the configs and results for each process together? I don't know too much about the workflow, so maybe that's nonsense.

Technically yes, but I do not want this. The idea of these workflows is that you have a well-defined configuration plus a well-defined input, and you pass both to a black box (which does some processing based on the config and the input) that returns an artifact bundling the results together. The results are not large in data volume, and neither out-of-core computing nor parallel I/O is required, so everything can go together in one output file. HDF5 internally stores time data, so blending input and output together would require one to recreate the config file for every new run, for example when you run again on another day with a different identifier. For this reason I decided to always keep config and results apart in my tools.
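To illustrate the split, here is a minimal sketch of the pattern described above. It assumes plain JSON files as a stand-in for the HDF5/NeXus config and results files, and every name in it (`run_tool`, `config_sha256`, `results.<run_id>.json`, the `scale` parameter) is hypothetical and not taken from the actual tools. It only shows that the config file stays byte-identical across runs while each run writes its own results file that points back to the config.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def run_tool(config_path: Path, input_values: list, run_id: str, out_dir: Path) -> Path:
    """Hypothetical black box: reads config + input, writes one results file."""
    config = json.loads(config_path.read_text())
    # Stand-in for the real processing: scale every input value.
    processed = [v * config["scale"] for v in input_values]
    results = {
        "run_id": run_id,
        # Reference the config by checksum instead of copying it into the results.
        "config_sha256": sha256_of(config_path),
        "values": processed,
    }
    results_path = out_dir / f"results.{run_id}.json"
    results_path.write_text(json.dumps(results, indent=2))
    return results_path


def demo() -> dict:
    """Run the tool twice against the same config and report what changed."""
    with tempfile.TemporaryDirectory() as tmp:
        out_dir = Path(tmp)
        config_path = out_dir / "config.json"
        config_path.write_text(json.dumps({"scale": 2.0}))
        checksum_before = sha256_of(config_path)

        first = json.loads(run_tool(config_path, [1.0, 2.0], "1", out_dir).read_text())
        second = json.loads(run_tool(config_path, [1.0, 2.0], "2", out_dir).read_text())

        return {
            "config_unchanged": sha256_of(config_path) == checksum_before,
            "same_config_referenced": first["config_sha256"] == second["config_sha256"],
            "run_ids": [first["run_id"], second["run_id"]],
            "values": second["values"],
        }


if __name__ == "__main__":
    print(demo())
```

With the config kept in its own file, a second run with a different identifier only adds a new results file; the config is neither rewritten nor duplicated.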

lukaspie commented 1 day ago
Ok, makes sense, I'll approve then.