openmethane / openmethane-prior

Method to calculate a gridded, prior emissions estimate for methane across Australia
Apache License 2.0

Migrate to ADS-beta #46

Closed: aethr closed this 1 month ago

aethr commented 1 month ago

Description

Resolves #19 .

The Copernicus CDS and ADS systems are being migrated to a new architecture, and all clients must update to use the new "ADS beta" and "CDS beta" before the legacy system is decommissioned on 26/09/2024.

This code change is accompanied by the creation of a new account, and by replacing the CDSAPI_ADS_KEY organisation secret in GitHub Actions with the new credentials.

Checklist

Please confirm that this pull request has done the following:

Notes

aethr commented 1 month ago

It appears the GFAS NetCDF file has changed somewhat in the new API. Where previously it was NetCDF3 generated by grib_to_netcdf-2.25.1, it is now NetCDF4 generated by "GRIB to CDM+CF via cfgrib-0.9.14.0/ecCodes-2.36.0".

This includes a structural change to the time dimension and variable, which are now named valid_time.

Original gfas-download.nc (in Panoply):

netcdf ../gfas-download.legacy.nc {
  dimensions:
    longitude = 3600;
    latitude = 1800;
    time = 1;
  variables:
    float longitude(longitude=3600);
      :units = "degrees_east";
      :long_name = "longitude";

    float latitude(latitude=1800);
      :units = "degrees_north";
      :long_name = "latitude";

    int time(time=1);
      :units = "hours since 1900-01-01 00:00:00.0";
      :long_name = "time";
      :calendar = "gregorian";

    short ch4fire(time=1, latitude=1800, longitude=3600);
      :scale_factor = 5.874831290468975E-13; // double
      :add_offset = 1.924947220635064E-8; // double
      :_FillValue = -32767S; // short
      :missing_value = -32767S; // short
      :units = "kg m**-2 s**-1";
      :long_name = "Wildfire flux of Methane";

  // global attributes:
  :Conventions = "CF-1.6";
  :history = "2024-09-16 15:23:15 GMT by grib_to_netcdf-2.25.1: /opt/ecmwf/mars-client/bin/grib_to_netcdf.bin -S param -o /cache/data8/adaptor.mars_constrained.internal-1726500194.713318-15307-14-8af58a44-8084-407d-91a9-afada1eaf3d9.nc /cache/tmp/8af58a44-8084-407d-91a9-afada1eaf3d9-adaptor.mars_constrained.internal-1726500193.5876522-15307-24-tmp.grib";
}

New gfas-download.nc from ADS-beta:

netcdf ../gfas-download.beta.nc {
  dimensions:
    latitude = 1800;
    longitude = 3600;
    valid_time = 1;
  variables:
    double latitude(latitude=1800);
      :_FillValue = NaN; // double
      :units = "degrees_north";
      :standard_name = "latitude";
      :long_name = "latitude";
      :stored_direction = "decreasing";

    float ch4fire(valid_time=1, latitude=1800, longitude=3600);
      :_FillValue = NaNf; // float
      :GRIB_paramId = 210082L; // long
      :GRIB_dataType = "ga";
      :GRIB_numberOfPoints = 6480000L; // long
      :GRIB_typeOfLevel = "surface";
      :GRIB_stepUnits = 1L; // long
      :GRIB_stepType = "avg";
      :GRIB_gridType = "regular_ll";
      :GRIB_uvRelativeToGrid = 0L; // long
      :GRIB_NV = 0L; // long
      :GRIB_Nx = 3600L; // long
      :GRIB_Ny = 1800L; // long
      :GRIB_cfName = "unknown";
      :GRIB_cfVarName = "ch4fire";
      :GRIB_gridDefinitionDescription = "Latitude/Longitude Grid";
      :GRIB_iDirectionIncrementInDegrees = 0.1; // double
      :GRIB_iScansNegatively = 0L; // long
      :GRIB_jDirectionIncrementInDegrees = 0.1; // double
      :GRIB_jPointsAreConsecutive = 0L; // long
      :GRIB_jScansPositively = 0L; // long
      :GRIB_latitudeOfFirstGridPointInDegrees = 89.95; // double
      :GRIB_latitudeOfLastGridPointInDegrees = -89.95; // double
      :GRIB_longitudeOfFirstGridPointInDegrees = 0.05; // double
      :GRIB_longitudeOfLastGridPointInDegrees = 359.95; // double
      :GRIB_missingValue = 3.4028234663852886E38; // double
      :GRIB_name = "Wildfire flux of Methane";
      :GRIB_shortName = "ch4fire";
      :GRIB_totalNumber = 0L; // long
      :GRIB_units = "kg m**-2 s**-1";
      :long_name = "Wildfire flux of Methane";
      :units = "kg m**-2 s**-1";
      :standard_name = "unknown";
      :GRIB_number = 0L; // long
      :GRIB_surface = 0.0; // double
      :coordinates = "valid_time latitude longitude";
      :_ChunkSizes = 1U, 57U, 225U; // uint

    double longitude(longitude=3600);
      :_FillValue = NaN; // double
      :units = "degrees_east";
      :standard_name = "longitude";
      :long_name = "longitude";

    long valid_time(valid_time=1);
      :standard_name = "time";
      :units = "seconds since 1970-01-01";
      :calendar = "proleptic_gregorian";
      :long_name = "time";

  // global attributes:
  :GRIB_centre = "ecmf";
  :GRIB_centreDescription = "European Centre for Medium-Range Weather Forecasts";
  :GRIB_subCentre = 0L; // long
  :Conventions = "CF-1.7";
  :institution = "European Centre for Medium-Range Weather Forecasts";
  :history = "2024-09-17T00:33 GRIB to CDM+CF via cfgrib-0.9.14.0/ecCodes-2.36.0 with {\"source\": \"data.grib\", \"filter_by_keys\": {\"typeOfLevel\": \"surface\"}, \"encode_cf\": [\"parameter\", \"time\", \"geography\", \"vertical\"]}";
}

Note that the valid_time variable now has :units = "seconds since 1970-01-01", where the original time variable had :units = "hours since 1900-01-01 00:00:00.0".

aethr commented 1 month ago

The change from time to valid_time isn't a problem in itself, as we use nc.num2date(ncin.variables["time"][:], ncin.variables["time"].getncattr("units")) to convert values to a universal format. However, it appears there might be a bug in CDS itself, as the valid_time values seem to be offset from what is actually being requested.
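
For reference, here is a minimal sketch (not the repository's actual code) of decoding the time coordinate under either name; the file path is illustrative:

# Decode the time coordinate whether the file exposes "time" (legacy API)
# or "valid_time" (ADS beta). The file path is illustrative only.
import netCDF4 as nc

with nc.Dataset("gfas-download.nc") as ncin:
    time_name = "valid_time" if "valid_time" in ncin.variables else "time"
    time_var = ncin.variables[time_name]
    print(nc.num2date(time_var[:], time_var.getncattr("units")))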

Using the legacy API, fetching data for a period from 2023-01-01 to 2023-01-02:

$ poetry run python src/openmethane_prior/layers/omGFASEmis.py --start-date 2023-01-01 --end-date 2023-01-02

CAMS request:

c.retrieve(
    "cams-global-fire-emissions-gfas",
    {
        "date": "2023-01-01/2023-01-02",
        "format": "netcdf",
        "variable": [
            "wildfire_flux_of_methane",
        ],
    },
    file_name,
)

The fetched date range is correct, covering 2023-01-01 to 2023-01-02:

[cftime.DatetimeGregorian(2023, 1, 1, 0, 0, 0, 0, has_year_zero=False)
 cftime.DatetimeGregorian(2023, 1, 2, 0, 0, 0, 0, has_year_zero=False)]

Running the same command with the new API generates the same CDS request, but the valid_time range ends up as:

[cftime.DatetimeGregorian(2023, 1, 2, 0, 0, 0, 0, has_year_zero=False)
 cftime.DatetimeGregorian(2023, 1, 3, 0, 0, 0, 0, has_year_zero=False)]

aethr commented 1 month ago

I've raised a support request about the off-by-one bug in the ADS beta: https://jira.ecmwf.int/plugins/servlet/desk/portal/1/CUS-26007

prayner commented 1 month ago

Well, that's unsociable of them :-) Two options I see here:

1) Switch to using the netCDF4 module completely. I think we already include it in the openmethane container, since we use a mixture of xarray and netCDF4 already.
2) Use one of the GRIB conversion tools and go with the GRIB version (a rough sketch follows below). Probably more reliable long term but more work short term.

Discuss at 4?
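
An illustrative sketch of option 2, assuming cfgrib is installed alongside xarray (the file name and variable access are assumptions, not code from this repository):

# Read the GFAS GRIB file directly with xarray + cfgrib, bypassing the
# ADS NetCDF conversion entirely. Names here are illustrative.
import xarray as xr

ds = xr.open_dataset("gfas-download.grib", engine="cfgrib")
ch4fire = ds["ch4fire"]  # wildfire flux of methane, kg m**-2 s**-1
print(ch4fire.coords)    # check which time coordinate cfgrib produced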

aethr commented 1 month ago

I've made some fixes that allowed me to re-enable the test_gfas_emis test, which is now PASSING. However, I'm not familiar enough with the test to understand whether the checks it's doing are meaningful.

The fact that it passes despite the dates in the fetched GFAS data being wrong might be an issue.

The fact that the e2e om_prior tests also pass is good from a mechanical point of view, since I believe it means the changes in this PR successfully integrate the new data source. But it may be bad from a QA point of view if the data we're fetching comes back with dates that don't match the rest of the data being processed and this doesn't raise any alarms.

In the short term, I believe we should wait to merge this until we hear back from the ADS team about the off-by-one issue. If we must merge this to address the legacy API deprecation on September 26, we may want to shift our ADS API request by one day and then assert on the dates we get back, so that we're alerted when/if their bug is fixed.
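
A hypothetical sketch of that guard (the helper name and signature are assumptions, not code in this PR):

# Hypothetical guard: after downloading, check that the decoded dates match
# the requested range so we notice when/if the ADS beta behaviour changes.
import datetime

import netCDF4 as nc


def assert_gfas_dates(file_name: str, expected: list[datetime.date]) -> None:
    with nc.Dataset(file_name) as ncin:
        name = "valid_time" if "valid_time" in ncin.variables else "time"
        tvar = ncin.variables[name]
        decoded = nc.num2date(tvar[:], tvar.getncattr("units"))
        got = [datetime.date(d.year, d.month, d.day) for d in decoded]
        if got != expected:
            raise ValueError(f"GFAS dates {got} do not match requested {expected}")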

aethr commented 1 month ago

Response from Kevin Marsh @ ECMWF Support:


I think these variables are accumulated over 24 hours, so looking at the original (GRIB) data for your request:

% grib_ls 32e348309ddfe25309b35ad2f8327b13.grib
32e348309ddfe25309b35ad2f8327b13.grib
edition  centre  typeOfLevel  level  dataDate  stepRange  dataType  shortName  packingType  gridType
1        ecmf    surface      0      20230101  0-24       ga        ch4fire    grid_simple  regular_ll
1        ecmf    surface      0      20230102  0-24       ga        ch4fire    grid_simple  regular_ll
2 of 2 messages in 32e348309ddfe25309b35ad2f8327b13.grib
2 of 2 total messages in 1 files

So the 'valid time' is set to the end of the 24 hour period (i.e. 0000 on the following day):

% grib_ls -p validityDate,validityTime 32e348309ddfe25309b35ad2f8327b13.grib
32e348309ddfe25309b35ad2f8327b13.grib
validityDate  validityTime
20230102      0
20230103      0
2 of 2 messages in 32e348309ddfe25309b35ad2f8327b13.grib
2 of 2 total messages in 1 files

And this is how the data are encoded in the netCDF file:

% cdo info 7146ff2cddb82ae5c09a7705b64cd2cd.nc
-1 :       Date     Time   Level Gridsize    Miss :     Minimum        Mean     Maximum : Parameter ID
 1 : 2023-01-02 00:00:00       0  6480000       0 :      0.0000  2.5515e-13  3.8500e-08 : -1
 2 : 2023-01-03 00:00:00       0  6480000       0 :      0.0000  2.8209e-13  3.8500e-08 : -1
cdo info: Processed 12960000 values from 1 variable over 2 timesteps [0.15s 83MB].

So the time (GMT) is set to the end of the 24 hour period (rather than the start/middle).

aethr commented 1 month ago

So it appears there isn't a bug per se: we're getting the data we requested. It's just a change in behaviour from the previous system: the new one encodes the end of the time period in valid_time instead of the beginning, and the unfortunate effect of using midnight rather than 23:59:59 is that the date moves into the next day.

Knowing this, do we need to shift the returned dates back by one day so they match the rest of our data?

prayner commented 1 month ago

Not a bug perhaps, but pretty misleading labelling. Yes, we do need to shift the dates back by a day, since the emissions for Jan 1 (which we divide evenly into the 24 hours of that day) are listed under Jan 2.
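
A minimal sketch of that shift, assuming the dates have been decoded with netCDF4.num2date (illustrative, not the exact code merged in this PR):

# Shift each decoded valid_time back by one day so the accumulation labelled
# 2023-01-02 00:00 is attributed to 2023-01-01. The file path is illustrative.
import datetime

import netCDF4 as nc

with nc.Dataset("gfas-download.nc") as ncin:
    tvar = ncin.variables["valid_time"]
    dates = nc.num2date(tvar[:], tvar.getncattr("units"))
    shifted = [d - datetime.timedelta(days=1) for d in dates]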

prayner commented 1 month ago

I'm not seeing this one, already approved?

aethr commented 1 month ago

I haven't completed the change to shift the dates back by one day so they match the dates requested.

Then I think we need to action similar PRs in the other repos that use ADS/CDS data, and do some checking of the returned dates there as well to verify the behaviour.

We also need to think about how we'll merge the changes. If we want to update the API keys in GitHub, we'll need to coordinate merges to all the repos at the same time and then merge main back into any open PRs/branches. Alternatively, we could create new secrets for the new API credentials and eventually deprecate the old ones.

aethr commented 1 month ago

@prayner and I paired to update this branch and shift the dates from GFAS to match the requested dates. All tests are passing locally; they're only failing in CI because the secret still contains the legacy key.