Problem description

Note: This issue is filed in this repo because it appears that the root of the problem is here, not in the higher-level PDP app construction.

When a station does not have values for a variable at all time points in the station's total observation set, a fill value is provided for the absent values.

For example, suppose a station records temperature for one year, then precipitation for a second year. The total observation record spans 2 years. In the data file downloaded for this station, fill values are written to the temperature variable for the second year, and to the precipitation variable for the first year. Both variables have the same length, 2 years.
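The alignment described above can be sketched as follows. This is only an illustration of the behaviour, not the code in this repo: the observation data, the `align` helper, and the `FILL` placeholder are all hypothetical.

```python
# Hypothetical observations: (time point, value) pairs for each variable.
temperature = [("2018-01", 1.5), ("2018-02", 2.0)]
precipitation = [("2019-01", 10.0), ("2019-02", 12.5)]

FILL = None  # placeholder standing in for whatever fill value the server writes


def align(variables):
    """Put every variable on the union of all time points, filling the gaps."""
    times = sorted({t for obs in variables.values() for t, _ in obs})
    aligned = {}
    for name, obs in variables.items():
        lookup = dict(obs)
        aligned[name] = [lookup.get(t, FILL) for t in times]
    return times, aligned


times, aligned = align({"temperature": temperature, "precipitation": precipitation})
for name, values in aligned.items():
    print(name, values)
```

Every variable ends up with one entry per time point in the station's total record, so the question in this issue is only what gets written in place of `FILL`.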

The fill values are peculiar, in the following ways:

The value is not (at least in the cases observed) the missing_value specified in the metadata (available in NetCDF files).
The value is not anything easily identifiable, such as NaN or the maximum or minimum representable value.
The value changes when the server is restarted.
So far, the values observed from several servers (the production server and server instances running locally on a dev machine) are large floats, e.g., 1.48655878184907e+158.
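The checks listed above can be written down as a small helper. This is a rough sketch using only the standard library; the `classify` function and the `declared` value of -9999.0 are assumptions for illustration, and the real missing_value should be read from the file's metadata.

```python
import math
import sys


def classify(value, missing_value):
    """Return a label for the kind of sentinel that `value` resembles."""
    if math.isnan(value):
        return "NaN"
    if value == missing_value:
        return "declared missing_value"
    if math.isinf(value):
        return "infinity"
    if abs(value) == sys.float_info.max:
        return "float64 max/min"
    return "unrecognized"


observed = 1.48655878184907e+158  # value reported in this issue
declared = -9999.0  # hypothetical; take the real one from the file metadata
print(classify(observed, declared))
```

With these inputs the observed value falls through every check and is labelled "unrecognized", which matches the description above.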

To reproduce

If experimenting with a local dev instance, install pdp and run PCDS locally:

Mount /storage
Activate the virtual env in which PDP is installed.

Set environment variables:

export CPLUS_INCLUDE_PATH=/usr/include/gdal
export C_INCLUDE_PATH=/usr/include/gdal

export DSN=postgresql://user:pass@db3.pcic.uvic.ca/pcic_meta
export DATA_ROOT=http://127.0.0.1:8000/data
export PCDS_DSN=postgresql://user:pass@db3.pcic.uvic.ca/crmp
export APP_ROOT=http://127.0.0.1:8000

Run app: python scripts/rast_serve.py -p 8000

Open the PCDS app in the browser.
Select start date: 2019/09/25
Select variables: Temperature (Mean)
Draw a polygon around the Williams Lake station.
In Download Data, select format NetCDF, then click Timeseries.
Save file to disk.
Unzip it and examine the contents of 0550502.nc. The file is small enough that ncdump 0550502.nc is practical for this. Note the fill values at the beginning of variable HUMIDITY and at the end of WDIR_VECT. These are the peculiar values, which vary by server instance.

Additional information:

The non-fill values that appear for the variables have, in this particular example, been (partially) verified by querying the database directly. Only the fill values appear to be wrong.
If you restart the local server, a different peculiar fill value is provided for the same dataset.
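Because the fill value changes across restarts while the real observations do not, comparing two downloads of the same dataset is one way to locate the fill positions. The sketch below assumes the values of one variable have already been read into plain lists. The `differing_positions` helper and the sample numbers are hypothetical, and the second fill value in particular is invented for illustration.

```python
def differing_positions(first, second):
    """Indices at which two downloads of the same variable disagree."""
    if len(first) != len(second):
        raise ValueError("downloads must cover the same time points")
    return [i for i, (a, b) in enumerate(zip(first, second)) if a != b]


# Hypothetical HUMIDITY values, downloaded before and after a server restart.
before = [1.48655878184907e+158, 1.48655878184907e+158, 71.0, 68.5]
after = [3.2e+201, 3.2e+201, 71.0, 68.5]
print(differing_positions(before, after))
```

Note that a NaN compares unequal to itself, so any NaN present in both downloads would also be reported by this comparison.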

pacificclimate / pdp_util

Peculiar fill values in station data downloaded from PCDS #12
