ncar-xdev / ecgtools

ESM Catalog Generation tools
https://ecgtools.readthedocs.io
Apache License 2.0

Tool to combine catalogs into an archive #15

Open mnlevy1981 opened 4 years ago

mnlevy1981 commented 4 years ago

I think a YAML file like

experiment1:
  case1:
    catalog_path: [path to catalog]
    member_id: ###
  case2:
    catalog_path: [path to catalog]
    member_id: ###
experiment2:
  case3:
    catalog_path: [path to catalog]
    member_id: ###

would define the ensemble well enough. Each individual catalog would be read in, the experiment and member_id columns would be added, and the ctrl_case column would be replaced with ctrl_experiment and ctrl_member_id (assuming the case named in ctrl_case is also a member of the ensemble). All the individual catalogs would then be concatenated into one large file; relative path names would need to be replaced with absolute ones at this stage.
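
A minimal sketch of those steps, using only the Python standard library and treating each catalog as a list of row dictionaries. The function name combine_catalogs, the load_rows callback, and the path column name are assumptions made for illustration; none of them is part of ecgtools. The ensemble argument mirrors the YAML layout above after parsing.

```python
import os


def combine_catalogs(ensemble, load_rows):
    """Concatenate per-case catalogs into one list of rows.

    ensemble:  {experiment: {case: {"catalog_path": str, "member_id": int}}}
    load_rows: hypothetical callable returning a list of dict rows for a path
    """
    # Map each case name to (experiment, member_id) so ctrl_case can be resolved.
    case_lookup = {
        case: (experiment, info["member_id"])
        for experiment, cases in ensemble.items()
        for case, info in cases.items()
    }

    combined = []
    for experiment, cases in ensemble.items():
        for case, info in cases.items():
            catalog_dir = os.path.dirname(os.path.abspath(info["catalog_path"]))
            for row in load_rows(info["catalog_path"]):
                row = dict(row)  # copy so the caller's rows are not mutated
                row["experiment"] = experiment
                row["member_id"] = info["member_id"]

                # Replace ctrl_case with ctrl_experiment / ctrl_member_id.
                # A control case outside the ensemble resolves to (None, None).
                ctrl_case = row.pop("ctrl_case", None)
                ctrl = case_lookup.get(ctrl_case, (None, None))
                row["ctrl_experiment"], row["ctrl_member_id"] = ctrl

                # Assumed column name "path": resolve it against the catalog's directory.
                if "path" in row and not os.path.isabs(row["path"]):
                    row["path"] = os.path.normpath(
                        os.path.join(catalog_dir, row["path"])
                    )
                combined.append(row)
    return combined
```

The lookup table is built before any catalog is read, so a ctrl_case that points at a case listed later in the file still resolves.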

mnlevy1981 commented 4 years ago

@matt-long points out that we could do

[path to catalog1]:
  experiment: experiment1
  member_id: ###
[path to catalog2]:
  experiment: experiment1
  member_id: ###
[path to catalog3]:
  experiment: experiment2
  member_id: ###

Note that this requires us to leave case in when addressing NCAR/cesm-catalog#2.

This leaves the individual catalogs we've created as the primary keys, which makes sense from a "building upon existing catalogs" viewpoint.
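
To illustrate how the two layouts relate, here is a rough sketch that flattens the experiment/case layout into the catalog-keyed one. The function name to_catalog_keyed is hypothetical. The case name is not carried into the flattened form, which is one way to see why case would have to remain available in the catalogs themselves.

```python
def to_catalog_keyed(ensemble):
    """Convert {experiment: {case: {...}}} into {catalog_path: {...}}."""
    flat = {}
    for experiment, cases in ensemble.items():
        for case, info in cases.items():
            path = info["catalog_path"]
            # Catalog paths become the keys, so each one may appear only once.
            if path in flat:
                raise ValueError(f"catalog listed more than once: {path}")
            # The case name is dropped here; it has to come from the catalog itself.
            flat[path] = {"experiment": experiment, "member_id": info["member_id"]}
    return flat
```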

mnlevy1981 commented 4 years ago

Maybe the layout above would need to allow something like

[path to catalog1]:
  experiment: experiment1
  member_id: ###
[path to catalog2]:
  experiment: experiment1
  member_id: ###
[path to catalog3]:
  experiment: experiment2
  member_id: ###
  ctrl_experiment: experiment1
  ctrl_member_id: ###

to override the tool's attempt to determine the ctrl_ values itself.
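
One way such an override could behave, sketched under assumptions: explicit ctrl_experiment and ctrl_member_id values in an entry take precedence, and otherwise the values are inferred from the catalog's ctrl_case. The function resolve_ctrl and the case_lookup mapping are hypothetical names, not existing ecgtools API.

```python
def resolve_ctrl(entry, ctrl_case, case_lookup):
    """Return (ctrl_experiment, ctrl_member_id) for one catalog entry.

    entry:       dict parsed from the YAML for one catalog path
    ctrl_case:   value of the catalog's ctrl_case column (may be None)
    case_lookup: hypothetical {case: (experiment, member_id)} built from all catalogs
    """
    # Explicit values in the YAML take precedence over anything inferred.
    if "ctrl_experiment" in entry and "ctrl_member_id" in entry:
        return entry["ctrl_experiment"], entry["ctrl_member_id"]
    # Otherwise look up the control case; (None, None) if it is not in the ensemble.
    return case_lookup.get(ctrl_case, (None, None))
```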

andersy005 commented 4 years ago

Is this approach assuming that the paths to catalogs are going to be unique?

matt-long commented 4 years ago

Yes.
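
Since the layout relies on unique paths, a check along these lines might be useful. Two differently written paths can still point at the same file, so this sketch compares resolved paths rather than the raw strings. The function name find_aliased_catalogs is an assumption for illustration.

```python
import os
from collections import defaultdict


def find_aliased_catalogs(catalog_paths):
    """Group path strings that resolve to the same file on disk."""
    by_real_path = defaultdict(list)
    for path in catalog_paths:
        # realpath collapses "..", redundant separators, and symlinks.
        by_real_path[os.path.realpath(path)].append(path)
    # Keep only resolved paths that were reached by more than one spelling.
    return {real: names for real, names in by_real_path.items() if len(names) > 1}
```

An empty result means every listed catalog resolves to a distinct file.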