The NOMAD metainfo provides a common representation of data based on a shared schema. Each entry in NOMAD has associated metainfo data that contains all information represented by said entry. This metainfo data is created from raw uploaded data files by NOMAD parsers.
Currently the underlying metainfo Python library has only limited functionality to reference metainfo objects from different entries (or entries as a whole), data in associated files (e.g. "big binary" files that are not parsed), or data provided by external resources (e.g. via URLs).
This issue is not about how to use references to create a schema for workflows. It is just about how to add references to metainfo data, not what they are used for.
stories
a complex workflow that produced multiple NOMAD entries is represented as an additional NOMAD entry. Its metainfo data shows the workflow execution DAG where nodes reference the respective entries. These entries might reference the workflow that created them.
an experiment entry references a sample entry
experiment results are captured in a large HDF5 binary file (e.g. NEXUS) and it is unreasonable to represent it in metainfo. However, the metadata is parsed into metainfo and the data is referenced by addressing certain datasets in the HDF5 file
entry metainfo data references an external resource, e.g. a paper, wiki page, code homepage, external database API, etc.
requirements
the metainfo quantity supports files as a type, with specialisations for the expected MIME type (e.g. for preview pictures, HDF5, JSON, etc.)
the metainfo quantity supports URLs as a type
the metainfo quantity supports other metainfo sections and quantities as a type [already implemented]
reference values can be given as loaded metainfo objects [already implemented]
via entry ids and metainfo paths
via entry or upload ids and raw file paths
via NOMAD API URLs, e.g. https://oasis.institut.eu/nomad/api/v1/entries/<entry_id>/metainfo/path/to/section/or/quantity
HDF5 data files can be referenced with additional path segments that describe parts of an HDF5 file
the metainfo browser transparently resolves as many references as possible
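To make the "via NOMAD API URLs" requirement more concrete, here is a minimal sketch of how such a URL could be split into its parts. It only assumes the URL shape given above (`.../entries/<entry_id>/metainfo/<path>`). The names `EntryReference` and `parse_api_reference` are hypothetical and not part of the existing metainfo library, and the metainfo path `run/0/system` in the usage note is just an illustrative value.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class EntryReference(NamedTuple):
    """Hypothetical container for the parts of an entry reference."""
    api_base: str       # everything before "/entries/"
    entry_id: str       # the referenced entry
    metainfo_path: str  # path to a section or quantity; empty for the whole entry


def parse_api_reference(url: str) -> EntryReference:
    """Split a URL of the form <api_base>/entries/<entry_id>/metainfo/<path>."""
    parts = urlsplit(url)
    segments = [segment for segment in parts.path.split("/") if segment]

    # The entry id is the segment that follows "entries".
    if "entries" not in segments:
        raise ValueError(f"not an entry reference: {url!r}")
    index = segments.index("entries")
    if len(segments) < index + 2:
        raise ValueError(f"missing entry id: {url!r}")
    entry_id = segments[index + 1]

    # Whatever follows "metainfo" addresses a section or quantity.
    rest = segments[index + 2:]
    if rest and rest[0] == "metainfo":
        rest = rest[1:]

    api_base = f"{parts.scheme}://{parts.netloc}/" + "/".join(segments[:index])
    return EntryReference(api_base, entry_id, "/".join(rest))
```

For example, `parse_api_reference("https://oasis.institut.eu/nomad/api/v1/entries/abc123/metainfo/run/0/system")` yields the API base `https://oasis.institut.eu/nomad/api/v1`, the entry id `abc123`, and the metainfo path `run/0/system`. A URL that stops after the entry id is treated here as a reference to the entry as a whole.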
implementation
references are internally represented as URL-formatted strings; this can be just a path (i.e. for intra-entry references); a path with an entry id (i.e. for intra-NOMAD references); or an arbitrary URL
the metainfo browser chooses components to represent references based on the quantity type (e.g. a preview picture vs. a web resource)
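The three string forms named above could be told apart roughly as follows. This is only a sketch under stated assumptions: the issue does not fix a concrete syntax, so the code assumes that intra-NOMAD references start with `/entries/<entry_id>/` (mirroring the tail of the API URL) and that every other scheme-less string is an intra-entry path. `ReferenceKind` and `classify_reference` are hypothetical names.

```python
from enum import Enum
from urllib.parse import urlsplit


class ReferenceKind(Enum):
    INTRA_ENTRY = "intra-entry"  # just a path within the same entry
    INTRA_NOMAD = "intra-nomad"  # a path with an entry id
    EXTERNAL = "external"        # an arbitrary URL


def classify_reference(reference: str) -> ReferenceKind:
    """Classify a URL-formatted reference string into one of three kinds."""
    parts = urlsplit(reference)

    # Anything with a scheme and a host is treated as an arbitrary URL.
    if parts.scheme and parts.netloc:
        return ReferenceKind.EXTERNAL

    # Assumption: paths that carry an entry id start with "/entries/".
    if parts.path.startswith("/entries/"):
        return ReferenceKind.INTRA_NOMAD

    # Everything else is resolved relative to the current entry.
    return ReferenceKind.INTRA_ENTRY
```

With these assumptions, `/run/0/system` is classified as intra-entry, `/entries/abc123/metainfo/run/0/system` as intra-NOMAD, and `https://example.org/paper` as external. Full NOMAD API URLs also fall into the external case here; a real implementation would probably check them against known NOMAD installations first.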