Closed chengzhuzhang closed 2 years ago
I'm trying to figure out where I should "pull" this material. I guess I can pull to the e3sm/staging/resource location. Should there be a separate metadata file per experiment? Per ensemble? Or per dataset (one for atmos mon, one for atmos day, one for ocean mon, etc.)?
Hey Tony, I think you already have a directory set up for this repository. Metadata files are per experiment and per ensemble, and cover datasets from all realms.
Well, I'm sitting in /p/user_pub/e3sm/staging/resource/CMIP6-metadata/. It contains:
drwxrwxr-x. 2 bartoletti1 publishers 4096 May 20 10:14 E3SM-1-0
drwxrwxr-x. 2 bartoletti1 publishers 4096 Apr 19 13:38 E3SM-1-1
drwxrwxr-x. 2 bartoletti1 publishers 4096 Apr 19 13:38 E3SM-1-1-ECA
-rw-rw-r--. 1 bartoletti1 publishers  234 Apr 19 13:38 README.md
-rw-rw-r--. 1 bartoletti1 publishers 2705 Apr 19 13:38 template.json
drwxrwxr-x. 2 bartoletti1 publishers 4096 Apr 19 13:38 test
I did a "git pull" and it said
remote: Enumerating objects: 11, done.
remote: Counting objects: 100% (11/11), done.
remote: Compressing objects: 100% (3/3), done.
remote: Total 5 (delta 2), reused 5 (delta 2), pack-reused 0
Unpacking objects: 100% (5/5), 747 bytes | 19.00 KiB/s, done.
From https://github.com/E3SM-Project/CMIP6-Metadata
But I would expect to see a "E3SM-2-0" directory. Perhaps I should blow it all away and do a "git clone"?
Also: (base) -bash-4.2$ git branch
My bad.. I failed to add the new file. Please pull again..
What should I do when "git pull" says
From https://github.com/E3SM-Project/CMIP6-Metadata
   5eb5a6f..eb586c2  add_v2 -> origin/add_v2
There is no tracking information for the current branch.
Please specify which branch you want to merge with.
See git-pull(1) for details.

    git pull <remote> <branch>
(What does it want for "remote" and for "branch"? I assume "add_v2" is the branch...)
Not sure what happened. Try git fetch again?
This is doing something:
git pull https://github.com/E3SM-Project/CMIP6-Metadata add_v2
says
From https://github.com/E3SM-Project/CMIP6-Metadata
 * branch add_v2 -> FETCH_HEAD
Updating 618f8de..eb586c2
Finally - there is E3SM-2-0/historical_r1i1p1f1.json
When I list the 368 "phase 1" v2 dataset_ids, cut to just the "experiment + ensemble" fields (and sort/uniq), I get 34 results:
1pctCO2.ens1
abrupt-4xCO2.ens1
abrupt-4xCO2.ens2
amip.ens1
amip.ens2
amip.ens3
hist-aer.ens1
hist-aer.ens2
hist-aer.ens3
hist-aer.ens4
hist-aer.ens5
hist-all-xGHG-xaer.ens1
hist-all-xGHG-xaer.ens2
hist-all-xGHG-xaer.ens3
hist-all-xGHG-xaer.ens4
hist-all-xGHG-xaer.ens5
hist-GHG.ens1
hist-GHG.ens2
hist-GHG.ens3
hist-GHG.ens4
hist-GHG.ens5
historical.ens1
historical.ens2
historical.ens3
historical.ens4
historical.ens5
piClim-control.ens1
piClim-histaer.ens1
piClim-histaer.ens2
piClim-histaer.ens3
piClim-histall.ens1
piClim-histall.ens2
piClim-histall.ens3
piControl.ens1
I assume I need to convert (say) "piClim-histaer.ens3" to "piClim-histaer_r3i1p1f1", etc., to name the 34 metadata files.
I guess it is not necessary to rename, as long as the correct name is fed to e3sm_to_cmip.
hist-all-xGHG-xaer.* can be left out since we won't publish those.
Forgot about hist-all-xGHG-xaer, thanks. Renaming is not the issue - I need to discover what elements of the metadata are specific to each experiment/ensemble. I'll get there.
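For reference, the naming step discussed above can be sketched in a few lines of Python. This is only an illustration, not a script from the repository: the function name `metadata_filename` and the `skip` parameter are hypothetical, and it assumes every member uses the fixed `i1p1f1` suffix, as in the `historical_r1i1p1f1.json` file already in E3SM-2-0.

```python
import re

# Ensemble counts per experiment, taken from the 34-entry listing above.
COUNTS = {
    "1pctCO2": 1, "abrupt-4xCO2": 2, "amip": 3, "hist-aer": 5,
    "hist-all-xGHG-xaer": 5, "hist-GHG": 5, "historical": 5,
    "piClim-control": 1, "piClim-histaer": 3, "piClim-histall": 3,
    "piControl": 1,
}
PAIRS = [f"{exp}.ens{n}" for exp, count in COUNTS.items()
         for n in range(1, count + 1)]


def metadata_filename(pair, skip=("hist-all-xGHG-xaer",)):
    """Map 'experiment.ensN' to 'experiment_rNi1p1f1.json'.

    Returns None for experiments listed in `skip` (not to be published).
    Assumption: initialization, physics and forcing indices are all 1.
    """
    match = re.fullmatch(r"(?P<exp>.+)\.ens(?P<n>\d+)", pair)
    if match is None:
        raise ValueError(f"unexpected experiment/ensemble pair: {pair!r}")
    if match["exp"] in skip:
        return None
    return f"{match['exp']}_r{match['n']}i1p1f1.json"


names = [name for name in map(metadata_filename, PAIRS) if name is not None]
print(len(PAIRS), "pairs ->", len(names), "metadata files")
```

Dropping the five hist-all-xGHG-xaer members from the 34 pairs leaves 29 file names.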
OK - we now have 29 differently-named v2 metadata files. I checked on the v1 historical files to see how they differ across ensembles:
(base) -bash-4.2$ diff historical_r1i1p1f1.json historical_r2i1p1f1.json
14c14
< "realization_index": "1",
---
> "realization_index": "2",
40c40
< "branch_time_in_parent": 36500.0,
---
> "branch_time_in_parent": 54750.0,

(base) -bash-4.2$ diff historical_r1i1p1f1.json historical_r3i1p1f1.json
14c14
< "realization_index": "1",
---
> "realization_index": "3",
40c40
< "branch_time_in_parent": 36500.0,
---
> "branch_time_in_parent": 73000.0,

(base) -bash-4.2$ diff historical_r1i1p1f1.json historical_r4i1p1f1.json
14c14
< "realization_index": "1",
---
> "realization_index": "4",
40c40
< "branch_time_in_parent": 36500.0,
---
> "branch_time_in_parent": 91250.0,

(base) -bash-4.2$ diff historical_r1i1p1f1.json historical_r5i1p1f1.json
14c14
< "realization_index": "1",
---
> "realization_index": "5",
40c40
< "branch_time_in_parent": 36500.0,
---
> "branch_time_in_parent": 109500.0,
So other than "realization_index", they have different (and increasing) "branch_time_in_parent" values. How should I accommodate this in the v2 sets?
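A quick arithmetic check on the v1 numbers above: they increase by a constant 18250 per realization, starting at 36500. If the values are days on a 365-day calendar (an assumption, the thread does not state the units), that is year 100 plus 50 years per additional member. The sketch below only reproduces the v1 pattern; the names `v1_branch_time` and `ensemble_overrides` are hypothetical, and the v2 branch times would have to come from the actual v2 simulation records rather than from this formula.

```python
# v1 historical values copied from the diffs above.
V1_BRANCH_TIMES = {1: 36500.0, 2: 54750.0, 3: 73000.0, 4: 91250.0, 5: 109500.0}


def v1_branch_time(realization, first=36500.0, step=18250.0):
    """Branch time of v1 historical member `realization` (pattern fit only)."""
    return first + step * (realization - 1)


def ensemble_overrides(template, realization, branch_time):
    """Return a copy of `template` with the two per-ensemble fields set.

    realization_index is stored as a string and branch_time_in_parent as a
    float, matching the v1 files shown in the diffs.
    """
    meta = dict(template)
    meta["realization_index"] = str(realization)
    meta["branch_time_in_parent"] = float(branch_time)
    return meta


# The linear pattern matches every v1 value listed in the thread.
for r, expected in V1_BRANCH_TIMES.items():
    assert v1_branch_time(r) == expected

r3 = ensemble_overrides({"experiment_id": "historical"}, 3, v1_branch_time(3))
print(r3)
```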
Next, I compared the v1 historical to the v1 piControl:
(base) -bash-4.2$ diff historical_r1i1p1f1.json piControl_r1i1p1f1.json
8c8
< "experiment_id": "historical",
---
> "experiment_id": "piControl",
26c26
< "parent_experiment_id": "piControl",
---
> "parent_experiment_id": "piControl-spinup",
40c40
< "branch_time_in_parent": 36500.0,
---
> "branch_time_in_parent": 0.0,
70c70
< "history": "",
---
> "history": "Output from 20180129.DECKv1b_piControl.ne30_oEC.edison. compset = A_WCYCL1850S_CMIP6",
72c72
< "comment": "",
---
> "comment": " piControl was configured to adhere as closely as possible with the CMIP6 DECK specifications (Eyring at al. 2016, GMD) with prescribed forcings appropriate for 1850 conditions. The simulation was run with time invariant forcings for a total of 500 years. To reduce spin-up time of the deep ocean, the simulation was initialized from a series of pre-existing control simulations performed with developmental versions of E3SM v1 as part of the final tuning phase (approximately 400 years). The final model tuning consisted of minor adjustments to cloud parameters with the objectives of achieving (1) near zero net top-of-atmosphere (TOA) radiation balance and (2) stable global mean surface air temperature.",
SOMEWHERE there needs to be a "metadata_guide" document that explains exactly which metadata fields vary ONLY with ensemble, which vary with experiment, which vary with simulation model, and which remain constant.
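Until such a guide exists, one way to get a first draft of it is to compare the parsed JSON files directly and report which keys differ. The following is a minimal sketch under that idea; `classify_fields` is a hypothetical helper, and the sample dictionaries contain only the handful of fields visible in the diffs above, not complete metadata files.

```python
import json


def classify_fields(files):
    """Split metadata keys into those constant across files and those varying.

    `files` maps a file name to its parsed metadata dict. A key missing from
    a file is treated as having the value None in that file.
    """
    all_keys = set().union(*(meta.keys() for meta in files.values()))
    constant, varying = {}, {}
    for key in sorted(all_keys):
        values = {name: meta.get(key) for name, meta in files.items()}
        # Serialize so that list or dict values can be compared as well.
        distinct = {json.dumps(v, sort_keys=True) for v in values.values()}
        if len(distinct) == 1:
            constant[key] = next(iter(values.values()))
        else:
            varying[key] = values
    return constant, varying


# Partial records built from the v1 values shown in the thread.
sample = {
    "historical_r1i1p1f1.json": {
        "experiment_id": "historical", "parent_experiment_id": "piControl",
        "realization_index": "1", "branch_time_in_parent": 36500.0,
    },
    "historical_r2i1p1f1.json": {
        "experiment_id": "historical", "parent_experiment_id": "piControl",
        "realization_index": "2", "branch_time_in_parent": 54750.0,
    },
}
constant, varying = classify_fields(sample)
print("constant:", sorted(constant))
print("varying: ", sorted(varying))
```

Running it once over the members of one experiment and once over the first member of every experiment would separate the ensemble-specific fields from the experiment-specific ones.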
Fixed all historical (realization_index and branch_time_in_parent, and history) and fixed experiment_id in all hist-aer.
Sorry for the oversight. I thought I'd caught everything. I wrote a small "check_values.sh" to make it easier to spot issues.
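The contents of check_values.sh are not shown in this thread. As a rough Python illustration of the kind of check that would catch the two mistakes fixed above (a wrong realization_index and a wrong experiment_id), one could compare each file's name with the values inside it. The names `NAME_RE` and `check_metadata` are hypothetical, and this is not the author's script.

```python
import re

# Expected file name pattern: <experiment>_r<N>i<N>p<N>f<N>.json
NAME_RE = re.compile(r"(?P<exp>.+)_r(?P<r>\d+)i\d+p\d+f\d+\.json")


def check_metadata(filename, meta):
    """Return a list of mismatches between a file name and its metadata."""
    match = NAME_RE.fullmatch(filename)
    if match is None:
        return [f"{filename}: unexpected file name"]
    problems = []
    if meta.get("experiment_id") != match["exp"]:
        problems.append(
            f"{filename}: experiment_id is {meta.get('experiment_id')!r}, "
            f"expected {match['exp']!r}")
    if str(meta.get("realization_index")) != match["r"]:
        problems.append(
            f"{filename}: realization_index is "
            f"{meta.get('realization_index')!r}, expected {match['r']!r}")
    return problems


# A file copied from historical_r1 without updating its fields is flagged twice.
stale = {"experiment_id": "historical", "realization_index": "1"}
for line in check_metadata("hist-aer_r2i1p1f1.json", stale):
    print(line)
```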
Shall I merge?
Not yet. Would you please revert the license to the old v1 license for the v1 material? That is, undo the changes in this commit: "update license info for all v1 user metadata".
This will be tricky - I need to revert the "add_v2" branch to a previous state (for v1 stuff) and not lose all the v2 edits.
I'll see what I can do - I need to revert to obtain the old license info.
(I'm sure there's a git-way to revert specific files.)
I think you could just add a new commit that changes the license text, if that's easier.
If you have the old v1 files, that would be easier. If you can copy the "old" E3SM-1-* metadata files to their respective locations under e3sm/staging/resource/CMIP6-metadata, I can simply add them to a commit. But if you have an open repo with the old v1 files, it would be fine to just re-commit them yourself and, upon merge, resolve the conflicts by taking your v1 files.
I see now that I should have created a separate branch for "update_v1_licenses", rather than mix that up with the "add_v2" (metadata).
Alternately, I can re-edit the v1 files, but I need a copy of the old license info.
My recent commit history:
commit 6ee6556443b5aa60a710223c55796cc145068b40 (HEAD -> add_v2, origin/add_v2)
Author: Tony Bartoletti <bartoletti1@llnl.gov>
Date:   Fri Jul 29 11:37:56 2022 -0700

    fixed misfire on realization_index

commit 62a8d627b0d4566ec167b2bda212e22ea3182737
Author: Tony Bartoletti <bartoletti1@llnl.gov>
Date:   Fri Jul 29 11:12:20 2022 -0700

    fix overlooked values in historical and hist-aer metadata

commit 7fc5f2ccdf3cfa18c92727fc84d240aa7976dfb1
Author: Tony Bartoletti <bartoletti1@llnl.gov>
Date:   Thu Jul 28 14:41:39 2022 -0700

    update license info for all v1 user metadata
I guess I can always find the old license info in old published files.
I think there is a way to "git stash" all current state, revert everything to "pre-v1-license edits", and then "git stash pop" only selected files (the v2 files). I'll investigate.
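An alternative to the stash approach that may be simpler: `git checkout <tree-ish> -- <paths>` restores only the named paths from an earlier commit and leaves everything else (here, the v2 files) untouched. The sketch below wraps that command in Python; the helper names are hypothetical, and it assumes the license commit (7fc5f2c) changed only files under the three E3SM-1-* directories listed earlier. Note that any later edits inside those directories would be rolled back as well, and the restored files still need to be committed afterwards.

```python
import subprocess

# v1 directories from the listing at the top of the thread.
V1_DIRS = ["E3SM-1-0", "E3SM-1-1", "E3SM-1-1-ECA"]


def restore_paths_cmd(commit, paths):
    """Build the argv for: git checkout <commit>^ -- <paths...>

    `<commit>^` is the parent of `commit`, i.e. the state just before it.
    """
    return ["git", "checkout", f"{commit}^", "--", *paths]


def restore_paths(commit, paths, repo="."):
    """Run the checkout inside `repo`; raises CalledProcessError on failure."""
    subprocess.run(restore_paths_cmd(commit, paths), cwd=repo, check=True)


# Print the command only; call restore_paths(...) to actually run it.
print(" ".join(restore_paths_cmd("7fc5f2c", V1_DIRS)))
```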
This PR is superseded by #9.
Hey Tony @TonyB9000, I just added a template for v2. Would you please follow it to populate the metadata definition files for the other v2 simulations? The experiment_id and activity mapping can be found at https://wcrp-cmip.github.io/CMIP6_CVs/docs/CMIP6_experiment_id.html