Open LucaCinquini opened 10 years ago
Suggested course of action:
a) Code the AccessLoggingFilter defensively, so that if a mounting point is not found, the download is not interrupted, but rather the original URL is used for logging purposes
b) Add the ability to read the mounting points from a more general location ("esgf_disk_mounts.xml") instead of the esg.ini file
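A minimal sketch of what suggestion (a) could look like, in Java. This is not the actual AccessLoggingFilter code: the class name MountPointResolver and the methods resolve and pathForLogging are hypothetical, and the sketch assumes mounts are held as a simple map from dataset root name to local directory. The point is only that an unresolved mount yields a fallback value for logging instead of a null that later breaks the download.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of the ESGF code base.
class MountPointResolver {
    // Dataset root name -> local directory, e.g. "esg_dataroot" -> "/esg/archive".
    private final Map<String, String> mounts;

    MountPointResolver(Map<String, String> mounts) {
        this.mounts = new LinkedHashMap<>(mounts);
    }

    // Returns the local path if the URL path starts with a known dataset root,
    // or an empty Optional (never null) if no mount matches.
    Optional<String> resolve(String urlPath) {
        for (Map.Entry<String, String> e : mounts.entrySet()) {
            String prefix = "/" + e.getKey() + "/";
            if (urlPath.startsWith(prefix)) {
                return Optional.of(e.getValue() + "/" + urlPath.substring(prefix.length()));
            }
        }
        return Optional.empty();
    }

    // Value to write to the access log: the resolved local path when available,
    // otherwise the original URL, so logging never interrupts the download.
    String pathForLogging(String originalUrl, String urlPath) {
        return resolve(urlPath).orElse(originalUrl);
    }
}
```

With only esg_dataroot configured, a request under /gass-ytoc-mip/ would be logged with its original URL rather than failing.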
Both good suggestions... I think both are worth doing. As long as we keep an eye on the security posture... but these suggestions look like they should be legit. Let's see if I can find time to code it or have a team member do the honors.
The AccessLoggingFilter will crash, and cause the data download to fail, if the requested file URL cannot be resolved to a known disk mounting point. For example:
ACCESS LOGGING FILTER ERROR
AccessLoggingFilter.resolveUrlToFile(http://esg-datanode.jpl.nasa.gov/thredds/fileServer/gass-ytoc-mip/04_GISS_ModelE2/expt2/zg/ModelE.zg.20100110.00Z.nc)
Group 3 = /gass-ytoc-mip/04_GISS_ModelE2/expt2/zg/ModelE.zg.20100110.00Z.nc
Resolving /gass-ytoc-mip/04_GISS_ModelE2/expt2/zg/ModelE.zg.20100110.00Z.nc
Scanning over [1] mounts
-Resolved to local path: [null]
Mountpoint transformation of url path: [http://esg-datanode.jpl.nasa.gov/thredds/fileServer/gass-ytoc-mip/04_GISS_ModelE2/expt2/zg/ModelE.zg.20100110.00Z.nc] -to-> [null]
The filter assumes that all mounting points are listed in the file /esg/config/esgcet/esg.ini, for example:
thredds_dataset_roots =
    esg_dataroot | /esg/archive
    gass-ytoc-mip | /davarchive/data/archive
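For illustration, "name | path" entries like these could be read into a lookup table roughly as follows. This is a sketch only: DatasetRootsParser is a hypothetical class, not the parser the filter actually uses, and it assumes the option value has already been extracted from esg.ini with one "name | path" entry per line.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of the ESGF code base.
class DatasetRootsParser {
    // Parses the value of thredds_dataset_roots, assumed to hold one
    // "name | path" entry per line, into a name -> path map.
    static Map<String, String> parse(String value) {
        Map<String, String> roots = new LinkedHashMap<>();
        for (String line : value.split("\\R")) {
            String[] parts = line.split("\\|", 2);
            if (parts.length != 2) {
                continue; // blank or malformed line: skip rather than fail
            }
            String name = parts[0].trim();
            String path = parts[1].trim();
            if (!name.isEmpty() && !path.isEmpty()) {
                roots.put(name, path);
            }
        }
        return roots;
    }
}
```

Malformed lines are skipped on purpose, in the same defensive spirit: a bad entry should reduce what can be resolved, not break the filter.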
The problem is that not all data published to ESGF uses the esg.ini file:
a) The TDS utility can publish data automatically, if it is placed in a specifically configured directory
b) The new ESGF publishing services in "PUSH" mode can simply send any valid XML files to the Index Node, without the need for using the traditional ESGF publisher and esg.ini file