Closed pamfilos closed 8 years ago
I would try to use the following structure:
|- jsonschemas/
| |- definitions/
| | |- basic-metadata-v1.0.0.json
| |- records/
| | |- atlas/
| | | |- main-measurement-v1.0.0.json
| | |- cms/
| | |- lhcb/
jsonschemas/<system type (records, deposits, files, ...)>/<experiment>/<document type>-v{version}.json
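To make the pattern above concrete, here is a minimal sketch of how such a path could be split back into its parts. The function and class names are hypothetical, and the lowercase-only character classes and the three-part version number are assumptions, not something the repository enforces today.

```python
import re
from typing import NamedTuple


class SchemaPath(NamedTuple):
    """Parts of a schema path (hypothetical helper type)."""
    system_type: str
    experiment: str
    document_type: str
    version: str


# Assumption: lowercase names and X.Y.Z versions only.
_PATTERN = re.compile(
    r"^jsonschemas/(?P<system_type>[a-z0-9_-]+)/(?P<experiment>[a-z0-9_-]+)/"
    r"(?P<document_type>[a-z0-9-]+)-v(?P<version>\d+\.\d+\.\d+)\.json$"
)


def parse_schema_path(path: str) -> SchemaPath:
    """Split a path of the proposed form into its four components."""
    match = _PATTERN.match(path)
    if match is None:
        raise ValueError(f"path does not follow the proposed layout: {path}")
    return SchemaPath(**match.groupdict())
```

With this sketch, `parse_schema_path("jsonschemas/records/atlas/main-measurement-v1.0.0.json")` yields `records`, `atlas`, `main-measurement` and `1.0.0`, while a path without an experiment level (such as the one under `definitions/`) is rejected.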
-
/measurements/main-measurement-v1.0.0.json -> /measurements/main-v1.0.0.json
)vX.Y.Z
The complexity that I see is that we have three levels:
<Collaboration>-Analysis
<Collaboration>-Analysis-Segments
"Collaboration-Unspecific"-Analysis-Segments
Building on @jirikuncar's structure and @tiborsimko's suggestions we could have something like:
|- jsonschemas/
| |- definitions/
| | |- basic-metadata-v1.0.0.json ("Collaboration-Unspecific"-Analysis-Segments)
| |- records/
| | |- atlas/
| | | |- ATLASAnalysis-v0.0.1.json (<Collaboration>-Analysis)
| | | |- measurements/
| | | | |- main-v1.0.0.json (<Collaboration>-Analysis-Segments)
| | |- cms/
| | |- lhcb/
We should discuss whether this structure moves the "Collaboration-Unspecific"-Analysis-Segments too far away from the other Analysis-related jsonschema definitions, and how we might narrow this gap so that the layout is more coherent and intuitive.
@Kjili with only one comment: s/ATLASAnalysis/analysis/.
Another possibility would probably go better with the options/ folder content and the meaning behind records/:
|- jsonschemas/
| |- definitions/
| | |- basics/
| | | |- metadata-v1.0.0.json ("Collaboration-Unspecific"-Analysis-Segments)
| | |- atlas/
| | | |- measurements/
| | | | |- main-v1.0.0.json (<Collaboration>-Analysis-Segments)
| | |- cms/
| | |- lhcb/
| |- records/
| | |- ATLASAnalysis-v0.0.1.json (<Collaboration>-Analysis)
For my intuition this does not result in as wide a gap between the different segments and the main analysis schema.
It might not be the best solution, though, if we start having ATLASAnalysis, ATLASWorkflows, ATLASx, ATLASy, ... and the same for ALICE, CMS and LHCb, unless we start creating atlas/, alice/, ... folders inside the records/ folder.
@Kjili keeping a similar folder structure in records/ (atlas/, alice/, ...) seems more logical. Again, I would highly recommend using only lowercase filenames. It would also be preferable to use directories instead of file prefixes (e.g. ATLASAnalysis -> atlas/analysis).
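As a rough illustration of the prefix-to-directory idea, the sketch below maps a prefixed file name to a lowercase path. The function name and the list of collaboration prefixes are assumptions for the example. The conversion is purely mechanical, so any hand-picked shorter names would still need to be chosen manually.

```python
import re

# Assumption: the set of collaboration prefixes used in current file names.
COLLABORATIONS = ("ATLAS", "ALICE", "CMS", "LHCb")


def prefix_to_directory(filename: str) -> str:
    """Turn e.g. 'ATLASAnalysis-v0.0.1.json' into 'atlas/analysis-v0.0.1.json'."""
    for prefix in COLLABORATIONS:
        if filename.startswith(prefix):
            rest = filename[len(prefix):]
            # Put a hyphen at each lowercase/digit-to-uppercase boundary,
            # then lowercase the whole name.
            kebab = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "-", rest).lower()
            return f"{prefix.lower()}/{kebab}"
    raise ValueError(f"no known collaboration prefix in: {filename}")
```

For example, `prefix_to_directory("ATLASAnalysis-v0.0.1.json")` gives `atlas/analysis-v0.0.1.json`.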
While we are at it, we should also change the file names themselves to represent the new naming scheme, e.g. CMSFinalInputCodeOutput -> cms/final-results or cms-final-results respectively, and remove the files that are outdated.
@tiborsimko what do you think?
@suenjedt I basically expressed my thoughts above. The most important question to me is whether the collaborations might want to use some names already based on their current vocabulary practices, or whether we are going to maintain these names ourselves. E.g. adage schemas may already be used outside, so we may have less liberty to "compress" them into some standard form.
If we have all the liberties, I'd go for a very simple flat structure like <namespace>-foobar-v1.0.0.json, which seems more flexible than inventing some nested directory hierarchies, in case we decide to reorganise, move, or otherwise amend "measurements" and friends later. (This would be option 1 above.)
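For illustration only, a flat name of this shape could be taken apart as sketched below. The function name is hypothetical, and the sketch assumes that the namespace itself contains no hyphen, so that the first hyphen always separates namespace from schema name.

```python
import re

# Assumption: namespace has no hyphen; version is X.Y.Z.
_FLAT = re.compile(
    r"^(?P<namespace>[a-z0-9]+)-(?P<name>[a-z0-9-]+)"
    r"-v(?P<version>\d+\.\d+\.\d+)\.json$"
)


def split_flat_name(filename: str) -> tuple:
    """Return (namespace, name, version) for a flat schema file name."""
    match = _FLAT.match(filename)
    if match is None:
        raise ValueError(f"not a flat schema name: {filename}")
    return match.group("namespace"), match.group("name"), match.group("version")
```

Under that assumption, `split_flat_name("cms-final-results-v1.0.0.json")` returns `("cms", "final-results", "1.0.0")`, so the namespace stays recoverable from the file name alone, wherever the file lives.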
Closing this, since everybody seems to be happy with how things are right now.
"since everybody seems to be happy with how things are right now."
That's a bit of a strong statement. There are good reasons behind a good directory structure: it will make it easier to create correct ES indices in the future.
I can reopen it then; it was closed after being discussed in today's meeting.
Creating/moving this issue here for the comment by @tiborsimko in PR #188:
Shall we take this occasion and also clean names so that the full URI would be stable?
E.g. this PR introduces:
jsonschemas/records/CMSAnalysis-v0.0.1.json
which seems nicely "namespaced" to CMS, but also:
/jsonschemas/definitions/workflow_schemas/yadage/scheduler/parameterselection-v0.0.1.json
where the namespace is less clear from the name only (parameterselection-v0.0.1.json) and one has to rely on the preceding path. (The namespace could be "yadage" here, but it's kind of "less visible" in the middle.)
Option 1: we could use a flat directory structure and put namespaces in file names only:
Option 2: we could use directories and let experiments name the schemas as they see fit:
(I guess some file name prefix is nice to have, because if we always rely on directory location, then we may run into trouble when two files have the same name. It could be error-prone.)
Option 3: combine the above, and allow nested directories:
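The same-name concern raised under Option 2 could be checked mechanically. Below is a minimal sketch (the function name and the example paths are hypothetical) that groups schema paths by bare file name and reports any name that appears in more than one directory.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_name_collisions(paths):
    """Map each file name that occurs more than once to its full paths."""
    by_name = defaultdict(list)
    for path in paths:
        by_name[PurePosixPath(path).name].append(path)
    return {name: found for name, found in by_name.items() if len(found) > 1}


# Hypothetical example paths, not files that exist in the repository.
example = [
    "jsonschemas/records/atlas/analysis-v0.0.1.json",
    "jsonschemas/records/cms/analysis-v0.0.1.json",
    "jsonschemas/definitions/basic-metadata-v1.0.0.json",
]
collisions = find_name_collisions(example)
```

Here `collisions` contains a single entry for `analysis-v0.0.1.json`, listing both the atlas and cms paths. Such a check could run in CI if directory-only namespacing were chosen.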