Open woutdenolf opened 2 years ago
So this basically requires a better description of the experiment_identifier, entry_identifier and collection_identifier fields in NXentry. @prjemian @FreddieAkeroyd From the history it seems both of you have worked on this.
collection_identifier is a group of files (or group of database records, for the Bluesky framework) of which this data is a part. Sometimes that's a parent folder; it could also be a single SPEC data file, a set of SPEC files, or a collection in MongoDB (as in Bluesky).
entry_identifier is a placeholder for the identification of this data provided by the facility.
experiment_identifier and entry_identifier seem identical but (probably) provide a distinction to some facilities.
Here's my suggestion:
> I'm looking for a destination for the following metadata associated to a data collection:
>
> synchrotron: "ESRF"
> beamline: "ID31"
> proposal id, defined in the scope of the ESRF: "hg-124"
> data collection id, defined in the scope of the proposal: "S_220323_00006_0001"
> sample ID, defined in the user scope: "S-220323-00006"
> sample UUID, defined in the user scope: "4590be84-3493-4bd2-91fe-4cf39cfcf71f"
> name of the technique(s): ["x-ray powder diffraction"]
> command: "fscan 0 10 100 0.1"
# describe the information as provided (NXentry has "experiment_documentation" for this)
/entry/experiment_documentation:NXnote/beamline = "ID31"
/entry/experiment_documentation:NXnote/command = "fscan 0 10 100 0.1"
/entry/experiment_documentation:NXnote/data_collection_id = "S_220323_00006_0001"
/entry/experiment_documentation:NXnote/proposal_id = "hg-124"
/entry/experiment_documentation:NXnote/sample_id = "S-220323-00006"
/entry/experiment_documentation:NXnote/sample_uuid = "4590be84-3493-4bd2-91fe-4cf39cfcf71f"
/entry/experiment_documentation:NXnote/synchrotron = "ESRF"
/entry/experiment_documentation:NXnote/techniques = ["x-ray powder diffraction"]
# fill out the standard base classes
/entry/command -- link to /entry/experiment_documentation/command
/entry/entry_identifier -- link to /entry/experiment_documentation/data_collection_id
/entry/experiment_identifier -- link to /entry/experiment_documentation/proposal_id
/entry/instrument/name -- link to /entry/experiment_documentation/beamline
/entry/instrument/source/name -- link to /entry/experiment_documentation/synchrotron
/entry/instrument/source/probe = "x-ray"
/entry/instrument/source/type = "Synchrotron X-ray Source"
/entry/sample/name -- link to /entry/experiment_documentation/sample_id
# for convenience, but not described in NeXus (so not illegal, either), provide at root level
/@beamline = "ID31"
/@facility = "ESRF"
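The layout suggested above can be sketched in plain Python to make the "store as provided, then link" idea concrete. This is only an illustration under assumptions: `Link`, `build_layout` and `resolve` are hypothetical helper names, the layout is a flat dictionary rather than a real HDF5 file, and `Link` merely stands in for an HDF5 soft link. The paths and values are taken from the suggestion itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    """Hypothetical stand-in for an HDF5 soft link: stores only the target path."""
    target: str


DOC = "/entry/experiment_documentation"

# Metadata exactly as provided by the facility (values from the example above).
provided = {
    "beamline": "ID31",
    "command": "fscan 0 10 100 0.1",
    "data_collection_id": "S_220323_00006_0001",
    "proposal_id": "hg-124",
    "sample_id": "S-220323-00006",
    "sample_uuid": "4590be84-3493-4bd2-91fe-4cf39cfcf71f",
    "synchrotron": "ESRF",
    "techniques": ["x-ray powder diffraction"],
}

# Standard NeXus field -> key of the provided metadata it links to.
standard_links = {
    "/entry/command": "command",
    "/entry/entry_identifier": "data_collection_id",
    "/entry/experiment_identifier": "proposal_id",
    "/entry/instrument/name": "beamline",
    "/entry/instrument/source/name": "synchrotron",
    "/entry/sample/name": "sample_id",
}


def build_layout(provided, standard_links):
    """Return a flat {path: value or Link} mapping mirroring the suggestion."""
    # 1. Describe the information as provided.
    layout = {f"{DOC}/{key}": value for key, value in provided.items()}
    # 2. Fill out the standard base classes with links to the provided values.
    for path, key in standard_links.items():
        layout[path] = Link(f"{DOC}/{key}")
    layout["/entry/instrument/source/probe"] = "x-ray"
    layout["/entry/instrument/source/type"] = "Synchrotron X-ray Source"
    # 3. Convenience attributes at root level (not described in NeXus).
    layout["/@beamline"] = provided["beamline"]
    layout["/@facility"] = provided["synchrotron"]
    return layout


def resolve(layout, path):
    """Follow links until a concrete value is reached."""
    value = layout[path]
    while isinstance(value, Link):
        value = layout[value.target]
    return value


layout = build_layout(provided, standard_links)
print(resolve(layout, "/entry/experiment_identifier"))  # hg-124
print(resolve(layout, "/entry/entry_identifier"))       # S_220323_00006_0001
```

Keeping the facility's own names under experiment_documentation and linking the standard fields to them means each value is stored once, so the mapping to experiment_identifier or entry_identifier can be changed later without touching the original metadata.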
I went digging through old NIAC minutes for context and found:
experiment_identifier was part of the original definition of NXentry
run_number was part of the original NXentry definition and was documented as "number of run or scan stored in this entry"
run_number with entry_identifier
Git blame says that the following are 13 years old (i.e. coming from the old SVN repository):
This suggests that we need someone who has been in NeXus from the beginning (e.g. @mkoennecke @rayosborn @FreddieAkeroyd ) to get further context on the intention of these fields.
@rayosborn says that the facilities were each doing their own thing with various identifiers and this set was able to satisfy everyone. Actual usage probably varied a lot and nobody was very interested in forcing everyone to adopt the same usage.
We might also find some example usage in files from the example data repository.
Comments were made on this issue at 2022-06 Code Camp. @woutdenolf : Is it necessary to resolve this for release of NXDL now?
We can keep it for the next release.
Use of each of these terms seems to be particular to a subset of facilities. We could benefit from facility examples showing how they use (or do not use) each of these fields.
I'm looking for a destination for the following metadata associated to a data collection:
I'm currently thinking about this but I'm not sure I'm using the fields correctly:
I'm especially confused about experiment_identifier, entry_identifier and collection_identifier. Could someone clarify those? What I have is a proposal name "hg-124", a data collection name "S_220323_00006_0001" and the techniques used ["x-ray powder diffraction"].