coecms / clef

https://clef.readthedocs.io

Length of Datasets #147

Open alexborowiak opened 3 years ago

alexborowiak commented 3 years ago

Hi, it would be nice to be able to get the model run length of datasets. For the project I am working on now I need only extended-length runs (e.g. models that have been run to 2300).

paolap commented 3 years ago

Hi Alex,

I'm tagging this as an enhancement; I'm assuming you mean having an extra query option at the command line. If you use clef by importing the modules in your code (https://clef.readthedocs.io/en/stable/code.html#examples), you can get 2 extra fields in your query results, fdate and tdate, plus a True/False flag called time_complete, which indicates whether the files constitute an "unbroken" time series.

df project institute model ... fdate tdate time_complete path ...
/g/data/al33/replicas/CMIP5/combined/MIROC/MIRO... CMIP5 MIROC MIROC5 ... 20060101 21001231 True

In your case, you could then filter those results on tdate >= 23000101, or something similar.
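As a rough sketch of that filtering step, assuming the query results are already in a pandas DataFrame with tdate and time_complete columns and that tdate is an end date in YYYYMMDD form, as the example row suggests. The DataFrame below is a hand-built stand-in rather than real clef output, its second row is made up for illustration, and extended_runs is a hypothetical helper, not part of clef:

```python
import pandas as pd

# Hypothetical stand-in for a clef query result. The first row reuses the
# values from the example above; the second row is invented for illustration.
results = pd.DataFrame({
    "model": ["MIROC5", "EXAMPLE-MODEL"],
    "fdate": ["20060101", "20060101"],
    "tdate": ["21001231", "23001231"],
    "time_complete": [True, True],
})


def extended_runs(df, end="23000101"):
    """Keep rows with an unbroken time series that reaches at least `end`.

    Assumes `tdate` holds YYYYMMDD values (as strings or integers), so a
    plain numeric comparison orders the dates correctly.
    """
    tdate = pd.to_numeric(df["tdate"], errors="coerce")  # unparsable -> NaN
    return df[(tdate >= int(end)) & df["time_complete"]]


long_runs = extended_runs(results)
print(long_runs[["model", "fdate", "tdate"]])
```

Dropping the time_complete condition would also keep runs that reach 2300 but have gaps, which may or may not be what you want.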

I will look at ways to offer this on the command line. We discussed this previously, but the CMIP5 data is so irregular that it introduced quite a few exceptions. Thanks for your suggestion.