cedadev / cis

Home of the Community Intercomparison Suite.
www.cistools.net
GNU Lesser General Public License v3.0

Possibly incorrect file signature #8

Closed adamcpovey closed 7 years ago

adamcpovey commented 7 years ago

Is https://github.com/cedadev/cis/blob/master/cis/data_io/products/CCI.py#L57 correct? Deleting the first period gives me the behaviour I'd expect, whilst at the moment my variables don't have coordinates.

duncanwp commented 7 years ago

It shouldn't matter, but it depends on your filename structure. The periods match any character, and the star then wildcards it, so as long as there is at least one character before the ESACCI... then it will match - and all CCI files I've come across start with a datestring.
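A quick check of that reasoning, as a minimal sketch. It assumes the signature begins with a period followed by `.*` (which is what the description above implies, not something confirmed here) and that it is tested with Python's `re.match`. The short strings are made up for illustration.

```python
import re

# Assumed prefix: one mandatory character, then zero or more further characters.
leading_required = re.compile(r'..*ESACCI')
# The same prefix with the first period deleted: zero or more characters.
leading_optional = re.compile(r'.*ESACCI')

# One name with a datestring-like prefix, one that starts with ESACCI directly.
for name in ['20080101-ESACCI', 'ESACCI']:
    print(name,
          bool(leading_required.match(name)),
          bool(leading_optional.match(name)))
```

Under these assumptions, only the name with something in front of ESACCI satisfies the first pattern, while the second pattern accepts both.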

What is the structure of the files you're trying to match? If you include a verbose (-v) flag when you run your command you should see which product CIS is choosing. You can also override this by manually specifying the Cloud_CCI product.

adamcpovey commented 7 years ago

The Cloud CCI data stored on CEDA uses the format "ESACCI-L2-CLOUD-CLD-CC4CL_YYYYMMDDHHMM_fv.primary.nc", which is the second filename format acceptable in the CCI program (p.8 of cci.esa.int/sites/default/files/CCI_Data_Requirements_Iss1.2_Mar2015.pdf).

I'm trying to open the native outputs of ORAC, which are like the Cloud CCI format but less rigorously formatted. I got there with a new plugin that inherits the CCI class. I raised this issue as I couldn't see why you were requiring a character before "ESACCI". At the very least, wouldn't r'.+ESACCI.*CLOUD.*' be easier to read?

duncanwp commented 7 years ago

OK, that makes sense - I had only ever seen the first format.

r'.+ESACCI.*CLOUD.*' wouldn't work though because it would expect at least one character before ESACCI. r'.*ESACCI.*CLOUD.*' should work fine - as you pointed out at first. It would match the Aerosol CCI pattern then too.
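To compare the two candidates side by side, here is a small sketch. It assumes the patterns carry `.*` between the fixed tokens (r'.+ESACCI.*CLOUD.*' and r'.*ESACCI.*CLOUD.*') and that a signature is tested against the file's basename with `re.match`. The helper `matches`, the `/data` paths and the date-first filename are hypothetical; the second filename follows the CEDA format quoted earlier in the thread, with a made-up timestamp.

```python
import os
import re

# The two candidate signatures discussed above (assumed forms).
CANDIDATES = {
    'plus': r'.+ESACCI.*CLOUD.*',  # needs at least one character before ESACCI
    'star': r'.*ESACCI.*CLOUD.*',  # allows ESACCI at the very start
}


def matches(pattern, path):
    """Return True if the basename of path matches pattern from its start."""
    return re.match(pattern, os.path.basename(path)) is not None


files = [
    # Hypothetical date-first name, for illustration only.
    '/data/200806201200-ESACCI-L2_CLOUD-CLD_PRODUCTS-fv1.0.nc',
    # ESACCI-first name in the CEDA style quoted in the thread.
    '/data/ESACCI-L2-CLOUD-CLD-CC4CL_200806201200_fv.primary.nc',
]

for path in files:
    results = {label: matches(p, path) for label, p in CANDIDATES.items()}
    print(os.path.basename(path), results)
```

With these inputs, the `.+` form accepts only the date-first name, whereas the `.*` form accepts both, which is the behaviour asked for at the top of the thread.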