pacificclimate / pdp_util

A package supplying numerous apps for running PCIC's data server
GNU General Public License v3.0

Peculiar fill values in station data downloaded from PCDS #12

Open rod-glover opened 5 years ago

rod-glover commented 5 years ago

Problem description

Note: This issue is filed in this repo because it appears that the root of the problem is here, not in the higher-level PDP app construction.

When a station lacks values for a variable at some of the time points in its total observation set, the downloaded data contains a fill value in place of each absent value.

For example, suppose a station records temperature for one year, then precipitation for a second year. The total observation record spans 2 years. In the data file downloaded for this station, fill values are written to the temperature variable for the second year, and to the precipitation variable for the first year. Both variables have the same length, 2 years.
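The alignment described above can be sketched in a few lines of Python. This is an illustration only, not pdp_util's actual code: the function name `align_on_union`, the dict-based input, the sample readings, and the fill value -9999.0 are all assumptions made for the example.

```python
def align_on_union(series_by_var, fill):
    """Align each variable's observations on the union of all time points.

    series_by_var maps a variable name to a {time: value} dict.
    Time points a variable never observed are set to `fill`.
    (Hypothetical helper for illustration; not part of pdp_util.)
    """
    # The station's total observation set: every time seen by any variable.
    times = sorted(set().union(*(obs.keys() for obs in series_by_var.values())))
    aligned = {
        name: [obs.get(t, fill) for t in times]
        for name, obs in series_by_var.items()
    }
    return times, aligned


# Temperature recorded in year 1 only, precipitation in year 2 only
# (made-up readings).
observations = {
    "temperature": {"2001-06-01": 14.2, "2001-12-01": -3.1},
    "precipitation": {"2002-06-01": 0.0, "2002-12-01": 5.4},
}
times, aligned = align_on_union(observations, fill=-9999.0)
```

Both aligned lists end up with four entries, one per time point in the union. With a single declared fill, the padded positions are easy to recognize; the problem reported here is that the values actually written in those positions are not recognizable in this way.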

The fill values are peculiar, in the following ways:

  1. The value is not (at least in the cases observed) the missing_value specified in the metadata (which is available in NetCDF files).
  2. The value is not anything easily identifiable, such as NaN or a maximum or minimum value.
  3. The value changes when the server is restarted.
  4. So far, the values observed on several servers (the production server and server instances running locally on a dev machine) are very large floats, e.g., 1.48655878184907e+158.
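The first two checks in the list above can be expressed as a small diagnostic. This is a sketch under stated assumptions: `classify_fill` is a hypothetical helper, and the -9999.0 passed as `missing_value` is a stand-in for whatever the file's metadata actually declares.

```python
import math
import sys


def classify_fill(value, missing_value=None):
    """Return a label for `value` if it is a recognizable fill, else None.

    Hypothetical diagnostic helper; `missing_value` is whatever the
    file's metadata declares (pass None if nothing is declared).
    """
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "infinity"
    if missing_value is not None and value == missing_value:
        return "declared missing_value"
    if abs(value) == sys.float_info.max:
        return "float64 max/min"
    return None


observed = 1.48655878184907e+158  # example value quoted in this issue
# -9999.0 is an assumed stand-in for the declared missing_value.
print(classify_fill(observed, missing_value=-9999.0))
```

For the value quoted in this issue, none of the checks match, which is what makes the fill "peculiar": it is neither the declared missing value nor any conventional sentinel.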

To reproduce

  1. If experimenting with a local dev instance, install pdp and run PCDS locally:

    • Mount /storage
    • Activate the virtual env in which PDP is installed.
    • Set environment variables:

      export CPLUS_INCLUDE_PATH=/usr/include/gdal
      export C_INCLUDE_PATH=/usr/include/gdal
      
      export DSN=postgresql://user:pass@db3.pcic.uvic.ca/pcic_meta
      export DATA_ROOT=http://127.0.0.1:8000/data
      export PCDS_DSN=postgresql://user:pass@db3.pcic.uvic.ca/crmp
      export APP_ROOT=http://127.0.0.1:8000
    • Run the app: python scripts/rast_serve.py -p 8000
  2. Open the PCDS app in the browser.

  3. Select start date: 2019/09/25

  4. Select variables: Temperature (Mean)

  5. Draw a polygon around the Williams Lake station.

  6. In Download Data, select format NetCDF, then click Timeseries.

  7. Save file to disk.

  8. Unzip it and examine the contents of 0550502.nc. The file is small enough that ncdump 0550502.nc is practical for this. Note the fill values at the beginning of variable HUMIDITY and at the end of WDIR_VECT; these are the peculiar values, which vary by server instance.
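As an alternative to reading ncdump output by eye, the last step could be scripted roughly as follows. This is a sketch, not a tested tool: `edge_runs`, `looks_like_garbage`, and `report` are hypothetical names, the 1e30 threshold is an assumed heuristic for "implausibly large", and the variables are assumed to be one-dimensional along time. The file-reading part relies on the third-party netCDF4 package.

```python
def edge_runs(values, is_suspect):
    """Count suspect values at the start and at the end of a sequence."""
    n = len(values)
    lead = 0
    while lead < n and is_suspect(values[lead]):
        lead += 1
    trail = 0
    # Stop before the leading run so an all-suspect sequence is not counted twice.
    while trail < n - lead and is_suspect(values[n - 1 - trail]):
        trail += 1
    return lead, trail


def looks_like_garbage(x, threshold=1e30):
    """Assumed heuristic: no plausible station reading is this large."""
    return abs(x) > threshold


def report(path, names=("HUMIDITY", "WDIR_VECT")):
    """Print edge-run counts for the named variables in a NetCDF file."""
    import netCDF4  # third-party; imported lazily so edge_runs works without it

    with netCDF4.Dataset(path) as ds:
        ds.set_auto_mask(False)  # return raw values rather than masked arrays
        for name in names:
            var = ds.variables[name]
            declared = getattr(var, "missing_value", None)
            lead, trail = edge_runs(var[:].tolist(), looks_like_garbage)
            print(name, "missing_value:", declared,
                  "leading:", lead, "trailing:", trail)


# report("0550502.nc")  # uncomment after downloading the file as described above
```

Running `report` against files downloaded from two different server instances would show the same leading and trailing counts but, per the description above, different raw values in those positions.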

Additional information: