COSIMA / cosima-cookbook

Framework for indexing and querying ocean-sea ice model output.
https://cosima-recipes.readthedocs.io/en/latest/
Apache License 2.0

Cookbook database doesn't deal with non zero-padded output folder names #310

Closed rmholmes closed 1 year ago

rmholmes commented 1 year ago

I'm looking at Qian's future projections perturbation run in /g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf9091_qian_wthmp. The output in this directory goes from output836 (year 2110) to output1035 (year 2160). The folder names are not zero padded. Because of this, it looks like the cookbook database only gives access to the last 10 years (output1000, year 2151 to output1035, year 2160).

Is there a solution to this, other than requiring all folder names to be zero-padded?
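For reference, a minimal sketch of why non zero-padded names can matter wherever folders are ordered as plain strings: `output1000` sorts before `output836`. The helper `natural_key` is hypothetical and only illustrates a numeric-aware sort; it is not taken from the cookbook, and nothing here shows how the cookbook itself orders folders.

```python
import re

def natural_key(name):
    """Split a name into text and digit chunks so digit runs compare as integers."""
    return [int(chunk) if chunk.isdigit() else chunk
            for chunk in re.split(r"(\d+)", name)]

# Folder names in the style used by this experiment (a small sample).
folders = ["output1000", "output836", "output999", "output1035"]

lexicographic = sorted(folders)               # plain string comparison
natural = sorted(folders, key=natural_key)    # numeric-aware comparison

print(lexicographic)  # ['output1000', 'output1035', 'output836', 'output999']
print(natural)        # ['output836', 'output999', 'output1000', 'output1035']
```

A tool that receives the folders in plain string order would see the years out of sequence, even though no data is missing.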

aekiss commented 1 year ago

Are you sure the problem is due to the folder names? I have no problem with 01deg_jra55v140_iaf_cycle4_jra55v150_extension, which runs from output992 to output1004 - see /g/data/ik11/outputs/access-om2-01/01deg_jra55v140_iaf_cycle4_jra55v150_extension.

rmholmes commented 1 year ago

@aekiss no, I'm not sure. But I can't think of what else it would be. In a Jupyter notebook:

PERT_SST = cc.querying.getvar('01deg_jra55v13_ryf9091_qian_wthmp', 'temp', session, frequency='1 monthly')
PERT_SST.time

Gives output:

xarray.DataArray 'time' (time: 120)
array([cftime.DatetimeNoLeap(2150, 1, 16, 12, 0, 0, 0, has_year_zero=True),
...
       cftime.DatetimeNoLeap(2159, 12, 16, 12, 0, 0, 0, has_year_zero=True)],
      dtype=object)
...

Whereas with ncview I can see the whole time period (although the ordering is not right).

AndyHoggANU commented 1 year ago

Is it possible that this could be related to the permission problems we had on these files leading to funky indexing? Can we force the database to re-index these files (ping @angus-g)? Or perhaps just make a temporary new database to test?
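One quick way to test the permissions hypothesis is to walk the experiment directory and list anything the current user cannot read. This is only a sketch using the Python standard library; `find_unreadable` is a hypothetical helper, not part of the cookbook, and it checks access for whoever runs it, which may differ from the account that runs the indexing.

```python
import os

def find_unreadable(root):
    """Return paths under root that the current user cannot read or list."""
    blocked = []

    def record_error(err):
        # os.walk calls this with an OSError when a directory cannot be listed.
        blocked.append(err.filename)

    for dirpath, dirnames, filenames in os.walk(root, onerror=record_error):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK):
                blocked.append(path)
    return blocked
```

Calling `find_unreadable("/g/data/ik11/outputs/access-om2-01/01deg_jra55v13_ryf9091_qian_wthmp")` returns an empty list if every file is readable and every directory can be listed.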

angus-g commented 1 year ago

Yeah, I think it's from permissions problems: the earliest indexed date in that experiment is indeed 2150-01-01 (corresponding to output996). I think @micaeljtoliveira would be able to re-run the indexing to pick them up?
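The output numbers and years quoted in this thread are mutually consistent if each output folder holds three months of model time. That is an inference from the numbers (200 folders, output836 to output1035, covering 2110 to 2160), not something stated in the thread. A small arithmetic sketch under that assumption:

```python
OUTPUTS_PER_YEAR = 4                  # assumption: 3 months per output folder
FIRST_OUTPUT, FIRST_YEAR = 836, 2110  # from the issue description

def start_year(output_number):
    """Fractional model year at which the given output folder starts."""
    return FIRST_YEAR + (output_number - FIRST_OUTPUT) / OUTPUTS_PER_YEAR

for n in (836, 996, 1000, 1035):
    print(n, start_year(n))

# Folders from output996 to output1035 inclusive, at 3 months each.
indexed_months = (1035 - 996 + 1) * 12 // OUTPUTS_PER_YEAR
print(indexed_months)
```

Under this assumption output996 starts at 2150.0 and output1000 at 2151.0, and the 40 folders from output996 to output1035 hold 120 months, which matches the `time: 120` records returned by the query above.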

rmholmes commented 1 year ago

Oh ok. I thought the database update happened every night. That must be the explanation.

angus-g commented 1 year ago

I think it's meant to, but it looks like it's been failing lately due to running out of walltime: https://accessdev.nci.org.au/jenkins/blue/organizations/jenkins/COSIMA%2FCC%20Database%20Build/activity/

micaeljtoliveira commented 1 year ago

Unfortunately the nightly indexing has been failing quite often, but not always because of running out of walltime. I need to spend some time looking into this.

rmholmes commented 1 year ago

Just confirming that this issue was with the indexing. I can now access all the data.

micaeljtoliveira commented 1 year ago

@rmholmes Great! Last week I finally managed to fix several issues with the indexing. Hopefully it will be a smoother process from now on.

rmholmes commented 1 year ago

Thanks @micaeljtoliveira!