ESGF / esg-publisher

ESGF Publisher
http://esg-publisher.readthedocs.org/

Using several esg.ini files to publish multiple projects leads to publication mistakes #8

Open glevava opened 9 years ago

glevava commented 9 years ago

Using one esg.ini per project, which the publisher's -i option allows, would make the publication and unpublication process clearer and safer.

Unfortunately, this leads to mistakes in the THREDDS catalog management.

For example, each project is published under the thredds_root defined in its corresponding /esg/config/esgcet/esg.xxxx.ini. For esg.pmip3.ini:

    thredds_root = /esg/content/thredds/pmip3

Moreover, all thredds_dataset_roots are declared in every esg.xxxx.ini; otherwise the master catalog cannot be created:

    euclipse      | /prodigfs/esg/EUCLIPSE
    geomip        | /prodigfs/esg/GeoMIP
    cordex        | /prodigfs/esg/CORDEX
    tamip         | /prodigfs/esg/TAMIP
    pmip3         | /prodigfs/esg/PMIP3
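Since every per-project file has to carry the same list, a quick consistency check can help catch a root that was added to one file and forgotten in another. The following is only a minimal sketch: the helper names `parse_dataset_roots` and `missing_roots` are hypothetical, and it assumes that thredds_dataset_roots sits in the `[DEFAULT]` section as an indented multi-line value with one `name | path` pair per line, as in the listing above.

```python
import configparser


def parse_dataset_roots(ini_text, section="DEFAULT"):
    """Return {name: path} parsed from the thredds_dataset_roots option.

    Assumption: the option holds one 'name | path' pair per line and
    lives in `section` ([DEFAULT] unless told otherwise).
    """
    # RawConfigParser leaves %(...)s sequences untouched (no interpolation).
    parser = configparser.RawConfigParser()
    parser.read_string(ini_text)
    raw = parser.get(section, "thredds_dataset_roots", fallback="")
    roots = {}
    for line in raw.splitlines():
        if "|" not in line:
            continue  # skip blank or malformed lines
        name, path = (part.strip() for part in line.split("|", 1))
        roots[name] = path
    return roots


def missing_roots(roots_by_file):
    """Map each file name to the root names declared elsewhere but not in it."""
    all_names = set()
    for roots in roots_by_file.values():
        all_names.update(roots)
    report = {}
    for filename, roots in roots_by_file.items():
        absent = sorted(all_names - set(roots))
        if absent:
            report[filename] = absent
    return report
```

In practice one would read each /esg/config/esgcet/esg.xxxx.ini, pass its text to `parse_dataset_roots`, and hand the resulting dictionary (keyed by file name) to `missing_roots`; an empty result means all files declare the same root names.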

Then:

  1. I publish EUCLIPSE data:

         esgpublish -i /esg/config/esgcet/esg.euclipse.ini --project euclipse --thredds --service fileservice --map mapfile_euclipse.txt
         esgpublish -i /esg/config/esgcet/esg.euclipse.ini --project euclipse --publish --noscan --map mapfile_euclipse.txt

     => The THREDDS XML catalogs are correctly generated in /esg/content/thredds/euclipse.

  2. Then I publish PMIP3 data:

         esgpublish -i /esg/config/esgcet/esg.pmip3.ini --project pmip3 --thredds --service fileservice --map mapfile_pmip3.txt
         esgpublish -i /esg/config/esgcet/esg.pmip3.ini --project pmip3 --publish --noscan --map mapfile_pmip3.txt

     => The THREDDS XML catalogs are correctly generated in /esg/content/thredds/pmip3, _BUT_ the content of /esg/content/thredds/euclipse/catalog.xml, and _ONLY_ that content, is also copied into /esg/content/thredds/pmip3/catalog.xml (i.e., the dataset XML files themselves are not copied between thredds/pmip3 and thredds/euclipse).

  3. Consequently, the EUCLIPSE XML entries in the PMIP3 catalog lead to Server Error 500.
  4. I clean the PMIP3 catalog with:

         esgunpublish --skip-index --map EUCLIPE_dataset_to_delete.map

The results are the same when I run the corresponding esginitialize -c -i esg.xxxx.ini before each step.
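To spot the stray entries before THREDDS answers with an error, the master catalog of a given thredds_root can be scanned for references to dataset files that do not exist under that root. This is only a rough sketch, not part of the publisher: the function name `dangling_catalog_refs` is hypothetical, and it assumes that catalog.xml lists each dataset through a `catalogRef` element whose `xlink:href` attribute is a path relative to the thredds_root.

```python
import os
import xml.etree.ElementTree as ET

# Standard XLink namespace, used for the href attribute of catalogRef.
XLINK_HREF = "{http://www.w3.org/1999/xlink}href"


def dangling_catalog_refs(thredds_root):
    """List catalogRef hrefs in <thredds_root>/catalog.xml with no file on disk."""
    tree = ET.parse(os.path.join(thredds_root, "catalog.xml"))
    missing = []
    for element in tree.iter():
        # Compare the local tag name only, so the catalog namespace is ignored.
        if element.tag.rsplit("}", 1)[-1] != "catalogRef":
            continue
        href = element.get(XLINK_HREF)
        # Assumption: href is relative to the thredds_root directory.
        if href and not os.path.exists(os.path.join(thredds_root, href)):
            missing.append(href)
    return missing
```

Calling `dangling_catalog_refs("/esg/content/thredds/pmip3")` after step 2 would, under these assumptions, return the EUCLIPSE entries that ended up in the PMIP3 catalog without their dataset files.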

hramthun commented 9 years ago

What is the solution to this problem? What is the recommendation?

I want to publish several different projects into different 'directories':

    pmip3:       /esg/content/thredds/pmip3
    euclipse:    /esg/content/thredds/euclipse
    ...
    project4711: /esg/content/thredds/project4711

So the access is different for each project:

    my_server/thredds/pmip3/catalog.html
    my_server/thredds/euclipse/catalog.html
    ...
    my_server/thredds/project4711/catalog.html

instead of accessing all projects in one flat list with: myserver_url/thredds/esgcet/catalog.html

sashakames commented 9 years ago

I agree that the implementation as it stands has problems. Even with a single catalog, changes made to esg.ini are too ephemeral, while the design intended them to be persistent. However, that requires careful maintenance on the part of the admin.

I think that a meaningful solution probably requires updating the database schema to reflect the different catalogs, so that the publisher can properly switch between them.