ESPRI-Mod / synda

ESGF Downloader (this is a deprecated repository, the tool has now moved to https://github.com/ESGF/esgf-download)
https://espri-mod.github.io/synda/

importing data from previously downloaded archive #170

Open oloapinivad opened 3 years ago

oloapinivad commented 3 years ago

Dear all,

I wonder if there is an approach that could help us: we have about 20 TB of CMIP5/CMIP6 data that were downloaded before we set up synda. We are happy using synda now, but we were wondering whether there is any way to make synda "read" those data, or whether the only option is to download them again from scratch.

Thanks for any hint,
Paolo

painter1 commented 3 years ago

I did that three years ago. At the time, and probably now, there was no special tool for filling up the Synda database from any external information.

I did not try to fill in all columns in the Synda database, only the ones which were easy to include or which were necessary, such as *_functional_id and status. How hard this will be for you depends greatly on what you are starting out with. I used both the file system, i.e. the DRS paths to each file, and my old Postgres database. I don't remember all the details of how I did it, although I have some notes on it interspersed with notes on other activities.

These are the columns I populated from my old database in the Synda 'file' table: file_functional_id, local_path, filename, variable, status, size, dataset_id, last_mod_date. And in the Synda 'dataset' table: dataset_functional_id, version, status, template, local_path, last_mod_date (last_mod_date can be basically any date which isn't too recent). This obviously wouldn't be sufficient for files which haven't been downloaded yet, as they at least need a checksum field too.
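
For illustration only, here is a minimal Python sketch of that idea: it walks a DRS-style directory tree and fills in the columns listed above. Several things in it are assumptions and not taken from Synda itself: that the database is SQLite, the simplified `SCHEMA` (the real tables have more columns and may differ), the helper name `import_tree`, the status values `done` and `complete`, the rule that the functional id is the dotted relative path, and the rule that the variable is the filename prefix before the first underscore. Check all of these against your own Synda database, and try it on a copy first.

```python
import sqlite3
from pathlib import Path

# Hypothetical, simplified schema holding only the columns named above.
# The real Synda tables contain more columns and may be defined differently.
SCHEMA = """
CREATE TABLE IF NOT EXISTS dataset (
    dataset_id INTEGER PRIMARY KEY,
    dataset_functional_id TEXT UNIQUE,
    version TEXT,
    status TEXT,
    template TEXT,
    local_path TEXT,
    last_mod_date TEXT
);
CREATE TABLE IF NOT EXISTS file (
    file_id INTEGER PRIMARY KEY,
    file_functional_id TEXT UNIQUE,
    local_path TEXT,
    filename TEXT,
    variable TEXT,
    status TEXT,
    size INTEGER,
    dataset_id INTEGER REFERENCES dataset(dataset_id),
    last_mod_date TEXT
);
"""


def import_tree(conn, root, file_status="done", dataset_status="complete",
                last_mod_date="2000-01-01 00:00:00"):
    """Register every *.nc file under root; return the number of new file rows.

    Assumes each file sits directly inside its dataset (version) directory.
    The status defaults and last_mod_date are placeholder assumptions.
    """
    root = Path(root)
    conn.executescript(SCHEMA)
    added = 0
    for path in sorted(root.rglob("*.nc")):
        ds_rel = path.relative_to(root).parent
        # Assumed convention: functional id = relative DRS path joined by dots.
        ds_fid = ".".join(ds_rel.parts)
        conn.execute(
            "INSERT OR IGNORE INTO dataset (dataset_functional_id, version,"
            " status, template, local_path, last_mod_date)"
            " VALUES (?, ?, ?, ?, ?, ?)",
            (ds_fid, ds_rel.name, dataset_status, None,
             ds_rel.as_posix(), last_mod_date),
        )
        (ds_id,) = conn.execute(
            "SELECT dataset_id FROM dataset WHERE dataset_functional_id = ?",
            (ds_fid,),
        ).fetchone()
        cur = conn.execute(
            "INSERT OR IGNORE INTO file (file_functional_id, local_path,"
            " filename, variable, status, size, dataset_id, last_mod_date)"
            " VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
            (f"{ds_fid}.{path.name}", (ds_rel / path.name).as_posix(),
             path.name,
             path.name.split("_", 1)[0],  # assumed: variable is the name prefix
             file_status, path.stat().st_size, ds_id, last_mod_date),
        )
        added += cur.rowcount  # 0 when the file was already registered
    conn.commit()
    return added
```

Calling `import_tree(sqlite3.connect("copy_of_db.sqlite"), "/path/to/archive")` (both paths are placeholders) would register the files found; running it a second time adds nothing, because the functional ids are declared unique in this sketch.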