NCPP / ocgis

OpenClimateGIS is a set of geoprocessing and calculation tools for CF-compliant climate datasets.

Multiple time units for MFDatasets #435

Closed huard closed 7 years ago

huard commented 7 years ago

This bug affects time indexing when the following conditions are met:

What happens is that ocgis does not recognize that the time units are different for each file, and is not able to group data over a period (for example an average from 2030 to 2050).

netcdf4 has a MFTime class that handles this case, but it does not appear to be used in ocgis.
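
For illustration, here is a minimal sketch of why the raw time values cannot simply be concatenated when each file carries its own reference date. This is not ocgis or netCDF4 code: the helper names (`parse_units`, `rebase`) and the second file's units are hypothetical, and it relies on Python's standard Gregorian `datetime` rather than a CF calendar such as noleap.

```python
from datetime import datetime


def parse_units(units):
    """Return the reference datetime of a 'days since YYYY-MM-DD HH:MM:SS' string."""
    unit, _, reference = units.partition(" since ")
    if unit != "days":
        raise ValueError("this sketch only handles 'days since ...' units")
    return datetime.strptime(reference, "%Y-%m-%d %H:%M:%S")


def rebase(values, units, target_units):
    """Re-express day offsets relative to the reference date of target_units."""
    delta = parse_units(units) - parse_units(target_units)
    shift = delta.total_seconds() / 86400.0
    return [value + shift for value in values]


# Two files whose time values look identical but use different reference dates.
# The second units string is a hypothetical example.
values_a, units_a = [15.5, 45.0], "days since 2006-01-01 00:00:00"
values_b, units_b = [15.5, 45.0], "days since 2011-01-01 00:00:00"

# Naive concatenation ignores the units and repeats the same offsets.
naive = values_a + values_b

# Rebasing the second file onto the first file's units gives one consistent axis.
merged = values_a + rebase(values_b, units_b, units_a)
```

The naive list is not monotonic, which is the kind of time axis that cannot be grouped over a period; the rebased list is.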

bekozi commented 7 years ago

netcdf4 has a MFTime class that handles this case, but it does not appear to be used in ocgis.

Correct. I'll post once I've looked into it a bit more.

bekozi commented 7 years ago

Note: MFTime will have the same format limitations as MFDataset.

======================================================================
ERROR: test_netCDF4_MFTime (ocgis.test.test_simple.test_dependencies.TestDependencies)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/benkoziol/l/ocgis/src/ocgis/test/test_simple/test_dependencies.py", line 41, in test_netCDF4_MFTime
    mfd = MFDataset(paths)
  File "netCDF4/_netCDF4.pyx", line 5444, in netCDF4._netCDF4.MFDataset.__init__ (netCDF4/_netCDF4.c:64536)
ValueError: MFNetCDF4 only works with NETCDF3_* and NETCDF4_CLASSIC formatted files, not NETCDF4

----------------------------------------------------------------------
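
A rough way to see which files could trigger this error is to look at their leading bytes. The sketch below is an assumption-laden helper, not part of netCDF4 or ocgis: `sniff_netcdf_family` is a hypothetical name, and it only checks the signature at offset 0. Classic NetCDF 3 files start with `CDF` followed by a version byte, while both NETCDF4 and NETCDF4_CLASSIC files are HDF5 files, so an "hdf5" result still needs a data model check before it is passed to MFDataset.

```python
import os
import tempfile

HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"


def sniff_netcdf_family(path):
    """Classify a file as 'netcdf3', 'hdf5' or 'unknown' from its first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(8)
    # Version byte: 1 = classic, 2 = 64-bit offset, 5 = 64-bit data (CDF-5)
    if head[:3] == b"CDF" and head[3:4] in (b"\x01", b"\x02", b"\x05"):
        return "netcdf3"
    if head == HDF5_SIGNATURE:
        return "hdf5"
    return "unknown"


# Demonstration on stub files that contain only the signatures, not real data.
stubs = {
    "classic.nc": b"CDF\x01" + b"\x00" * 4,
    "hdf5.nc": HDF5_SIGNATURE,
    "other.bin": b"not a netcdf file",
}
results = {}
with tempfile.TemporaryDirectory() as tmp:
    for name, payload in stubs.items():
        path = os.path.join(tmp, name)
        with open(path, "wb") as handle:
            handle.write(payload)
        results[name] = sniff_netcdf_family(path)
```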

bekozi commented 7 years ago

@huard: See commit (https://github.com/NCPP/ocgis/commit/e978a651e698a055540e0d502ffc9c22aca51642) introducing MFTime. When you have time to test, let me know how it goes.

bekozi commented 7 years ago

Fixed in next and v-2.0.0.dev1.

huard commented 7 years ago

No go. MFTime is instantiated on time_bnds instead of (or as well as) time. Since time_bnds has no calendar attribute, MFTime raises an error:

netCDF4/_netCDF4.pyx in netCDF4._netCDF4.MFTime.__init__ (netCDF4/_netCDF4.c:71210)()

ValueError: MFTime requires that the time variable in all files have a calendar attribute

Tested on pr_Amon_GFDL-ESM2M_rcp45_r1i1p1_200601-201012.nc, pr_Amon_GFDL-ESM2M_rcp45_r1i1p1_200601-201012.nc.

bekozi commented 7 years ago

@huard If it's not too much trouble, could you pass along the metadata dumps for the two files?

huard commented 7 years ago

netcdf pr_Amon_GFDL-ESM2M_rcp45_r1i1p1_200601-201012 {
dimensions:
    time = UNLIMITED ; // (60 currently)
    lat = 90 ;
    lon = 144 ;
    bnds = 2 ;
variables:
    double average_DT(time) ;
        average_DT:long_name = "Length of average period" ;
        average_DT:units = "days" ;
    double average_T1(time) ;
        average_T1:long_name = "Start time for average period" ;
        average_T1:units = "days since 2006-01-01 00:00:00" ;
    double average_T2(time) ;
        average_T2:long_name = "End time for average period" ;
        average_T2:units = "days since 2006-01-01 00:00:00" ;
    double lat(lat) ;
        lat:long_name = "latitude" ;
        lat:units = "degrees_north" ;
        lat:standard_name = "latitude" ;
        lat:axis = "Y" ;
        lat:bounds = "lat_bnds" ;
    double lon(lon) ;
        lon:long_name = "longitude" ;
        lon:units = "degrees_east" ;
        lon:standard_name = "longitude" ;
        lon:axis = "X" ;
        lon:bounds = "lon_bnds" ;
    double bnds(bnds) ;
        bnds:long_name = "vertex number" ;
        bnds:cartesian_axis = "N" ;
    float pr(time, lat, lon) ;
        pr:long_name = "Precipitation" ;
        pr:units = "kg m-2 s-1" ;
        pr:cell_methods = "time: mean" ;
        pr:interp_method = "conserve_order1" ;
        pr:missing_value = 1.e+20f ;
        pr:_FillValue = 1.e+20f ;
        pr:standard_name = "precipitation_flux" ;
        pr:original_units = "kg/m2/s" ;
        pr:original_name = "precip" ;
        pr:cell_measures = "area: areacella" ;
        pr:associated_files = "baseURL: http://cmip-pcmdi.llnl.gov/CMIP5/dataLocation areacella: areacella_fx_GFDL-ESM2M_rcp45_r0i0p0.nc" ;
    double time(time) ;
        time:long_name = "time" ;
        time:units = "days since 2006-01-01 00:00:00" ;
        time:cartesian_axis = "T" ;
        time:calendar_type = "noleap" ;
        time:calendar = "noleap" ;
        time:bounds = "time_bnds" ;
        time:standard_name = "time" ;
        time:axis = "T" ;
    double time_bnds(time, bnds) ;
        time_bnds:long_name = "time axis boundaries" ;
        time_bnds:units = "days since 2006-01-01 00:00:00" ;
    double lat_bnds(lat, bnds) ;
    double lon_bnds(lon, bnds) ;

// global attributes:
        :title = "NOAA GFDL GFDL-ESM2M, RCP4.5 (run 1) experiment output for CMIP5 AR5" ;
        :institute_id = "NOAA GFDL" ;
        :source = "GFDL-ESM2M 2010 ocean: MOM4 (MOM4p1_x1_Z50_cCM2M,Tripolar360x200L50); atmosphere: AM2 (AM2p14,M45L24); sea ice: SIS (SISp2,Tripolar360x200L50); land: LM3 (LM3p7_cESM,M45)" ;
        :contact = "gfdl.climate.model.info@noaa.gov" ;
        :project_id = "CMIP5" ;
        :table_id = "Table Amon (31 Jan 2011)" ;
        :experiment_id = "rcp45" ;
        :realization = 1 ;
        :modeling_realm = "atmos" ;
        :tracking_id = "ca6e0315-c881-4063-9f7c-751c9d7426ea" ;
        :Conventions = "CF-1.4" ;
        :references = "The GFDL Data Portal (http://nomads.gfdl.noaa.gov/) provides access to NOAA/GFDL\'s publicly available model input and output data sets. From this web site one can view and download data sets and documentation, including those related to the GFDL coupled models experiments run for the IPCC\'s 5th Assessment Report and the US CCSP." ;
        :comment = "GFDL experiment name = ESM2M-HC1_2006-2100_all_rcp45_XC1. PCMDI experiment name = RCP4.5 (run1). Initial conditions for this experiment were taken from 1 January 2006 of the parent experiment, ESM2M-C1_all_historical_HC1 (historical). Several forcing agents varied during the 95 year duration of the RCP4.5 experiment based upon the MINICAM integrated assessment model for the 21st century. The time-varying forcing agents include the well-mixed greenhouse gases (CO2, CH4, N2O, halons), tropospheric and stratospheric O3, model-derived aerosol concentrations (sulfate, black and organic carbon, sea salt and dust), and land use transitions. Volcanic aerosols were zero and solar irradiance varied seasonally based upon late 20th century averages but with no interannual variation. The direct effect of tropospheric aerosols is calculated by the model, but not the indirect effects." ;
        :gfdl_experiment_name = "ESM2M-HC1_2006-2100_all_rcp45_XC1" ;
        :creation_date = "2011-08-12T01:56:03Z" ;
        :model_id = "GFDL-ESM2M" ;
        :branch_time = "52925" ;
        :experiment = "RCP4.5" ;
        :forcing = "GHG,SD,Oz,LU,SS,BC,MD,OC (GHG includes CO2, CH4, N2O, CFC11, CFC12, HCFC22, CFC113)" ;
        :frequency = "mon" ;
        :initialization_method = 1 ;
        :parent_experiment_id = "historical" ;
        :physics_version = 1 ;
        :product = "output1" ;
        :institution = "NOAA GFDL(201 Forrestal Rd, Princeton, NJ, 08540)" ;
        :history = "File was processed by fremetar (GFDL analog of CMOR). TripleID: [exper_id_K92MrW6Oa4,realiz_id_GX4D0HOU9Z,run_id_y4SDyPhdCz]" ;
        :parent_experiment_rip = "r1i1p1" ;
}

huard commented 7 years ago

It wasn't clear, but except for the tracking_id and creation date, the metadata is identical for both files.

bekozi commented 7 years ago

Ha, I figured. Always good to check though. Thanks for passing the metadata along.

I think the correct approach is to have the bounds variable inherit the calendar from its parent variable. Interesting that they listed the units on the time bounds and not the calendar. The calendar is also non-standard in this file.
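
As a minimal sketch of that inheritance idea, the snippet below works on plain dictionaries of attributes copied from the dump above. It is not ocgis internals, and `inherit_calendar` is a hypothetical helper name.

```python
def inherit_calendar(variables):
    """Give each bounds variable its parent's calendar when it has none.

    variables maps a variable name to a dict of its attributes.
    A patched copy is returned; the input is left untouched.
    """
    patched = {name: dict(attrs) for name, attrs in variables.items()}
    for attrs in variables.values():
        bounds_name = attrs.get("bounds")
        calendar = attrs.get("calendar")
        if bounds_name in patched and calendar is not None:
            # setdefault keeps any calendar the bounds variable already declares
            patched[bounds_name].setdefault("calendar", calendar)
    return patched


metadata = {
    "time": {
        "units": "days since 2006-01-01 00:00:00",
        "calendar": "noleap",
        "bounds": "time_bnds",
    },
    "time_bnds": {"units": "days since 2006-01-01 00:00:00"},
}

patched = inherit_calendar(metadata)
```

This only helps when the metadata can be modified in memory before it is interpreted.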

bekozi commented 7 years ago

This data does present a problem. The calendar is already inherited from the parent for bounds variables in ocgis, but MFTime operates from-file so there is no opportunity to intercept the metadata. There are two options as I see it:

  1. Add the appropriate attribute to the source data.
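  2. Use the dimension_map to ignore the time bounds altogether:

import ocgis

# paths is the list of NetCDF files that make up the multi-file dataset
rd = ocgis.RequestDataset(paths)
# Unset the bounds on the time dimension so that time_bnds is not used
rd.dimension_map.set_bounds(ocgis.constants.DimensionMapKey.TIME, None)
ops = ocgis.OcgOperations(dataset=rd, ...)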

What do you think?

huard commented 7 years ago

  1. Is out in this case, as the source files are read-only.
  2. I'm not familiar enough with ocgis to have an informed opinion.

How would time subsetting work then if time_bnds is not available? Would it use time instead?

I've tried your proposal but I'm missing something. rd.dimension_map is a dict without a set_bounds method.

bekozi commented 7 years ago

How would time subsetting work then if time_bnds is not available ? It would use time instead?

Yes. time_bnds will still be sliced, but it won't be used for time subsetting. Only the time centroids in time will be used.
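
To make the difference concrete, here is a small sketch comparing a centroid-only selection with one plausible bounds-aware rule (interval overlap). Both function names are hypothetical, and the overlap rule is an assumption for illustration; the rule ocgis actually applies when bounds are present may differ.

```python
def subset_by_centroid(centroids, lower, upper):
    """Return the indices whose centroid lies within [lower, upper]."""
    return [i for i, value in enumerate(centroids) if lower <= value <= upper]


def subset_by_bounds(bounds, lower, upper):
    """Return the indices whose (start, end) interval overlaps [lower, upper]."""
    return [
        i
        for i, (start, end) in enumerate(bounds)
        if start <= upper and end >= lower
    ]


# Monthly cells for January to March in a noleap calendar,
# expressed in days since the reference date.
bounds = [(0.0, 31.0), (31.0, 59.0), (59.0, 90.0)]
centroids = [15.5, 45.0, 74.5]

# Select days 20 to 60 with each rule.
by_centroid = subset_by_centroid(centroids, 20.0, 60.0)
by_bounds = subset_by_bounds(bounds, 20.0, 60.0)
```

With centroids only, a month is selected only if its midpoint falls inside the requested range, so the edges of a selection can differ from a bounds-based one.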

I've tried your proposal but I'm missing something. rd.dimension_map is a dict without a set_bounds method.

The dimension map was just recently objectified. You'll need to pull and re-install. Sorry, I should have mentioned that.

huard commented 7 years ago

Seems to work. Thanks!