azavea / climate-change-api

Apache License 2.0

Importing dates in noleap calendars skips February 29th #87

Closed rmartz closed 7 years ago

rmartz commented 8 years ago

We use the netCDF4 library to convert dates from a variety of calendar types into dates Python can understand. Some of the calendars are Gregorian, but the majority use a fixed-length "noleap" format, where every year is exactly 365 days long.

netCDF4 converts these values to the Python date they represent in their native calendar, not to the moment in time that matches it. For instance, 18674 days from 2005-01-01 is March 1st, 2056 in the noleap calendar, but February 17th, 2056 in a Gregorian calendar.

This causes problems because 2056 is a leap year, but there is no way for netCDF4 to produce a date for February 29th, 2056 from a noleap calendar:

18673.5 days since 2005-01-01 00:00:00 noleap
 => converts to 2056-02-28 as a Python date
18674.5 days since 2005-01-01 00:00:00 noleap
 => converts to 2056-03-01 as a Python date
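The arithmetic behind these two conversions can be reproduced with the standard library alone. This is a minimal sketch for illustration, not netCDF4's implementation; the helper names `noleap_date` and `gregorian_date` are hypothetical, and the base year 2005 is taken from the example above.

```python
from datetime import date, timedelta

# Month lengths for a 365-day year (February always has 28 days).
DAYS_IN_MONTH = [31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31]


def noleap_date(days_since_base, base_year=2005):
    """Return (year, month, day) in a noleap calendar.

    Hypothetical helper: counts whole days from January 1st of base_year
    and assumes a non-negative offset.
    """
    whole_days = int(days_since_base)  # drop the time-of-day fraction
    years, day_of_year = divmod(whole_days, 365)
    month = 1
    for length in DAYS_IN_MONTH:
        if day_of_year < length:
            break
        day_of_year -= length
        month += 1
    return (base_year + years, month, day_of_year + 1)


def gregorian_date(days_since_base, base_year=2005):
    """Return the Gregorian date the same number of whole days after the base."""
    return date(base_year, 1, 1) + timedelta(days=int(days_since_base))


print(noleap_date(18674))     # (2056, 3, 1)
print(gregorian_date(18674))  # 2056-02-17
print(noleap_date(18673.5))   # (2056, 2, 28)
print(noleap_date(18674.5))   # (2056, 3, 1)
```

No offset passed to `noleap_date` can yield February 29th, which is the gap described above.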

If we fill in the gaps so that the dates represent the same moment in time rather than the same calendar date, this causes another problem with years: dates early in one year will actually belong to the prior year, but are hard-coded to belong to the year they were imported for in the ClimateDataSource object. We would need to stop using the year from that object and instead store the actual date in the ClimateData object, which currently tracks only day_of_year.
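To make the year-boundary problem concrete, here is a small sketch of the "moment in time" interpretation using only the standard library. The function `moment_in_time` and the 0-based `day_of_year` convention are assumptions for illustration, not part of the importer.

```python
from datetime import date, timedelta

BASE = date(2005, 1, 1)  # reference date from the example above


def moment_in_time(year, day_of_year):
    """Gregorian date of a noleap (year, day_of_year) read as an elapsed-day count.

    Hypothetical helper; day_of_year is 0-based here.
    """
    offset = (year - BASE.year) * 365 + day_of_year
    return BASE + timedelta(days=offset)


# The first noleap day imported for 2056 lands in the previous Gregorian year,
# because the 12 leap days between 2005 and 2056 are never counted.
first_day = moment_in_time(2056, 0)
print(first_day)  # 2055-12-20
```

Under this interpretation the year stored on the data source would no longer match the year of the resulting date for the first days of each imported year.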

CloudNiner commented 8 years ago

Potentially reach out to Rawlings Miller: Rawlings.Miller@icfi.com

http://www.icfi.com/about/our-people/icf/m/miller-rawlings

rmartz commented 8 years ago

Useful reference on which models use which calendar system: Definitions for the calendar models used by the climate models

CloudNiner commented 8 years ago

After reviewing this with @rmartz's help, we've determined that we can safely move forward on the assumption that the dates in the files represent a calendar date, and not a moment in time. This requires no further modifications to our importer or database structure.

We looked at the CF Conventions "Time" section, along with how units are defined in the udunits package. These references state that both the gregorian year and 365_day calendar systems are exact multiples of the base unit 'second' rather than references to 'year', defined as 'Interval between 2 successive passages of sun through vernal equinox'. This led me to believe that we were OK, but we wanted to test it.

So, we decided to experimentally test our theory by plotting tasmax for year 2100 at a single point (City #9 - Phoenix AZ). We plotted the average of all the 365_day models vs the average of all the gregorian models, using this reference. If there was an offset due to these calendars referring to moments in time, we would expect to see a phase shift in the plot. Additionally, to reduce noise, we used a 7-day average on sheet 2 for each calendar. The '365_day shifted' column is a manually shifted version of the 365_day column to demonstrate the phase shift effect we would expect to see in the data if changes needed to be made.
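The same check can be sketched in code rather than a spreadsheet. The example below uses a synthetic seasonal curve, not the tasmax data from the API, and the 10-day shift is an arbitrary value chosen for the demonstration; `rolling_mean` and `best_lag` are hypothetical helpers.

```python
import math


def rolling_mean(values, window=7):
    """Moving average over `window` consecutive days."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


def best_lag(a, b, max_lag=30):
    """Number of days by which series b trails series a.

    Tries every lag in [-max_lag, max_lag] and returns the one with the
    smallest mean squared difference over the overlapping days.
    """
    def cost(lag):
        pairs = zip(a, b[lag:]) if lag >= 0 else zip(a[-lag:], b)
        diffs = [(x - y) ** 2 for x, y in pairs]
        return sum(diffs) / len(diffs)

    return min(range(-max_lag, max_lag + 1), key=cost)


# Synthetic stand-ins for the two model averages (NOT real data):
# a seasonal cycle peaking around day 200, and a copy delayed by 10 days.
SHIFT = 10
baseline = [30 + 10 * math.cos(2 * math.pi * (d - 200) / 365) for d in range(365)]
shifted = [30 + 10 * math.cos(2 * math.pi * (d - 200 - SHIFT) / 365) for d in range(365)]

print(best_lag(rolling_mean(baseline), rolling_mean(baseline)))  # 0
print(best_lag(rolling_mean(baseline), rolling_mean(shifted)))   # 10
```

A lag near zero between the 365_day and gregorian averages would be consistent with the calendar-date interpretation; a clearly nonzero lag would suggest the phase shift described above.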

https://docs.google.com/spreadsheets/d/1Jzz_gz1tdGcRJgl-NMkQynWizMOv4kUFsO0MyxvXr70/edit#gid=0

Query for 365_day calendar models: https://staging.api.futurefeelslike.com/api/climate-data/9/RCP85/?years=2100&models=BNU-ESM,CCSM4,CESM1-BGC,CSIRO-Mk3-6-0,CanESM2,GFDL-CM3,GFDL-ESM2G,GFDL-ESM2M,IPSL-CM5A-LR,IPSL-CM5A-MR,NorESM1-M,inmcm4,bcc-csm1-1

Query for Gregorian calendar models: https://staging.api.futurefeelslike.com/api/climate-data/9/RCP85/?years=2100&models=ACCESS1-0,CNRM-CM5,MIROC-ESM-CHEM,MIROC-ESM,MPI-ESM-LR,MPI-ESM-MR,MRI-CGCM3

CloudNiner commented 7 years ago

Closing this; it's resolved as above and we haven't had any further issues.