pacificclimate / climate-explorer-data-prep


update_metadata modifies and sets _FillValue as NA value #120

Closed sum1lim closed 4 years ago

sum1lim commented 4 years ago

Fill values are represented as NA (shown as _ by ncdump) in the output NetCDF file, whereas they are represented as -32768 in this test input file. For example:

ncdump gdd_annual_CanESM2_rcp85_r1i1p1_1951-2100.nc before update_metadata:

gdd =
  -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, 

. . .

    -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, 
    -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, 
    -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, 
    -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, 
    -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, 
    -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, 
    -32768, -32768, -32768, -32768, -32768, -32768, -32768, -32768, 1961, 
    1952, 1941, 1806, 1687, 1757, 1704, 1658, 1701, 1763, 1449, 1793, 1699, 
    1425, 1373, 1427, 1378, 1376, 1325, 1521, 1645, 1662, 2005, 2102, 2161, 
    2184, 2147, 2185, -32768, -32768, -32768, -32768, 2202, 2214, 2231, 2242, 
    2247, 2236, 2194, 2186, 2240, 2301, 2281, 2237, 2220, 1979, 1672, 1370, 
    1420, 1304, 1111, 1035, 998, 1184, 898, 909, 1319, 1389, 982, 902, 979, 
    1061, 884, 1110, 801, 859, 857, 620, 643, 892, 798, 682, 1118, 2093, 
    1820, 1990, 2430, 1474, 1376, 1428, 1526, 1666, 1773, 1790, 1657, 1669, 
    1337, 1639, 2036, 1686, 1613, 2057, 1466, 1229, 1725, 1450, 1653, 1658, 
    2109, 1881, 1761, 1752, 1851, 1526, 1183, 938, 1090, 1144, 1221, 1329, 
    1537, 2125, 2067, 1179, 1293, 1584, 1406, 1449, 1393, 1333, 1335, 1410, 
    1427, 1167, 1170, 1100, 1624, 1943, 1973, 1519, 981, 1081, 1132, 955, 
    1110, 1344, 1481, 1240, 1041, 957, 971, 954, 1017,

. . .

ncdump gdd_annual_CanESM2_rcp85_r1i1p1_1951-2100.nc after update_metadata:

. . .

gdd = 
  _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 

. . .

  _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, _, 
    _, 1961, 1952, 1941, 1806, 1687, 1757, 1704, 1658, 1701, 1763, 1449, 
    1793, 1699, 1425, 1373, 1427, 1378, 1376, 1325, 1521, 1645, 1662, 2005, 
    2102, 2161, 2184, 2147, 2185, _, _, _, _, 2202, 2214, 2231, 2242, 2247, 
    2236, 2194, 2186, 2240, 2301, 2281, 2237, 2220, 1979, 1672, 1370, 1420, 
    1304, 1111, 1035, 998, 1184, 898, 909, 1319, 1389, 982, 902, 979, 1061, 
    884, 1110, 801, 859, 857, 620, 643, 892, 798, 682, 1118, 2093, 1820, 
    1990, 2430, 1474, 1376, 1428, 1526, 1666, 1773, 1790, 1657, 1669, 1337, 
    1639, 2036, 1686, 1613, 2057, 1466, 1229, 1725, 1450, 1653, 1658, 2109, 
    1881, 1761, 1752, 1851, 1526, 1183, 938, 1090, 1144, 1221, 1329, 1537, 
    2125, 2067, 1179, 1293, 1584, 1406, 1449, 1393, 1333, 1335, 1410, 1427, 
    1167, 1170, 1100, 1624, 1943, 1973, 1519, 981, 1081, 1132, 955, 1110, 
    1344, 1481, 1240, 1041, 957, 971, 954, 1017,

. . .

I am not sure whether this change in how the fill values are represented will cause problems in later processing, since they are meaningless placeholder values, but it would be good to confirm that it is OK for them to remain as NA. I believe the change is caused by NetCDF4 Dataset operations, which are wrapped by the nchelpers CFDataset.

sum1lim commented 4 years ago

Fixes pacificclimate/thunderbird#17

corviday commented 4 years ago

I downloaded the test input file and ran update_metadata on it, but did not see the same effect.

jameshiebert commented 4 years ago

Question about this: NA isn't actually a possible value in a NetCDF variable. Stored values are only displayed as NA when they match the value of the variable's _FillValue attribute. So, by definition, changing a variable's attributes could change whether its values appear to be NA or not.

Unfortunately, you haven't provided enough information here to be able to determine whether that is what's happening. Could you also provide the full attributes of the gdd variable from ncdump -h both before and after the process?
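To make the point about attributes concrete, here is a minimal sketch in plain Python, not taken from update_metadata or nchelpers. It models, in simplified form, how a tool like ncdump decides whether to print a stored value or `_`, and how two sets of variable attributes could be compared. The function names `render` and `diff_attributes` and the attribute dictionaries are hypothetical; the real attributes of gdd were not posted in this issue.

```python
def render(values, fill_value=None):
    """Simplified model of ncdump output: a value equal to the variable's
    _FillValue attribute prints as '_'; everything else prints as stored."""
    return [
        "_" if fill_value is not None and v == fill_value else str(v)
        for v in values
    ]


def diff_attributes(before, after):
    """Return {name: (old, new)} for attributes that were added, removed,
    or changed. None stands for 'attribute not present'."""
    changes = {}
    for name in sorted(set(before) | set(after)):
        old = before.get(name)
        new = after.get(name)
        if old != new:
            changes[name] = (old, new)
    return changes


# Hypothetical data and attributes, for illustration only.
stored = [-32768, 1961, 1952]
attrs_before = {"units": "degree days"}
attrs_after = {"units": "degree days", "_FillValue": -32768}

shown_before = render(stored, attrs_before.get("_FillValue"))
shown_after = render(stored, attrs_after.get("_FillValue"))
changed = diff_attributes(attrs_before, attrs_after)
```

In this sketch the stored numbers are identical in both cases; only the attribute differs, which is enough to change what is displayed. With netCDF4-python, the real attribute dictionaries could presumably be built from a variable's `ncattrs()` and `getncattr(name)` methods, which is the same information that ncdump -h prints.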

sum1lim commented 4 years ago

As @corviday mentioned, the effect is not happening on my workstation either, so I think this issue can be closed. Maybe there was some confusion from working across multiple projects. However, the issue addressed in the PR still remains. @jameshiebert, I would still like to discuss it in the PR.