rspatial / terra

R package for spatial data handling https://rspatial.github.io/terra/reference/terra-package.html
GNU General Public License v3.0
534 stars 88 forks source link

`terra::rast()` does not correctly interpret a NetCDF file compared to `ncdf4` #1520

Open Jinta0Li opened 3 months ago

Jinta0Li commented 3 months ago

Hello,

I have encountered an issue with the terra package while trying to read a NetCDF file. The file in question can be downloaded from the following link: https://doi.org/10.3929/ethz-b-000648738.

When using terra::rast() to read the NetCDF file, the function does not correctly interpret the file's structure. Specifically:

**1. terra::rast() identifies only 2 variables in the file, whereas the ncdf4::nc_open() function correctly identifies 4 variables.

  1. Thencdf4::nc_open() shows that the file includes a time dimension, butterra::rast()does not recognize the time dimension at all.**

For your reference, here is the code I used to read the file with both packages:

library(ncdf4)
library(terra)

# Reading with ncdf4
nc_ncdf4 <- nc_open("D:/GRACE-SeDA_v1_2002_2022.nc")
print(nc_ncdf4)

# Reading with terra
nc_terra <- rast("D:/GRACE-SeDA_v1_2002_2022.nc")
print(nc_terra)

and here are the results:

>  print(nc_ncdf4)
File D:/GRACE-SeDA_v1_2002_2022.nc (NC_FORMAT_NETCDF4_CLASSIC):

     4 variables (excluding dimension variables):
        byte mask[latitude,longitude]   (Contiguous storage)  
            units: -
            description: The mask showing the valid data points. 1 = valid. 0 = invalid.
        float twsa[time,latitude,longitude]   (Contiguous storage)  
            units: mm
            description: Total water storage anomalies in the format of equivalent water height.
        float twsa_baseline[latitude,longitude]   (Contiguous storage)  
            units: mm
            description: The temporal mean of total water storage anomalies from 2004.000 to 2009.999 which has been removed from the original outputs of the deep learning model.
        float uncertainty[time,latitude,longitude]   (Contiguous storage)  
            units: mm
            description: 1-sigma uncertianties estimated based on deep ensemble and Monte-Carlo simulations.

     3 dimensions:
        time  Size:216 
            units: Modified Julian Date
        latitude  Size:360 
            units: degrees
            description: The latitudes of the center of each 0.5-degree cells.
        longitude  Size:720 
            units: degrees
            description: The lontitudes of the center of each 0.5-degree cells.

    6 global attributes:
        title: GRACE-SeDA: A global total water storage anomaly product with a spatial resolution of 0.5 degrees from self-supervised data assimilation
        institution: Institute of Geodesy and Photogrammetry, ETH Zurich
        authors: Junyang Gou and Benedikt Soja
        contact: Junyang Gou; jungou@ethz.ch
        license: CC-BY: Creative Commons Attribution 4.0 International
        history: 2024.05.16: Create the first version forced by WaterGAPv2.2e-GSWP3-ERA5.

>  print(nc_terra)
class       : SpatRaster 
dimensions  : 720, 360, 2  (nrow, ncol, nlyr)
resolution  : 0.5, 0.5  (x, y)
extent      : -90, 90, -180, 180  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=WGS84 +no_defs 
sources     : GRACE-SeDA_v1_2002_2022.nc:mask  
              GRACE-SeDA_v1_2002_2022.nc:twsa_baseline  
varnames    : mask 
              twsa_baseline 
names       : mask, twsa_baseline 
unit        :    -,            mm

Could you please investigate this issue or provide guidance on how to correctly read this NetCDF file using the terra package?

Thank you for your attention and assistance.

Best regards,

Jintao

pvanlaake commented 4 days ago

Hi Jintao,

From the print of the file contents it is clear that the axes of the data sets are reversed from the recommendations in the CF Metadata Conventions. While the file does not claim to adhere to the CF conventions, most software for reading netCDF files does. That apparently includes GDAL library, which reads the file for terra. It does manage to identify the axes for variables mask and twsa_baseline, probably by their names, but the 3D variables twsa and uncertainty have a time dimension which is apparently complicating things too much.

It does not stop there. The unit of degrees for the longitude and latitude dimensions should not be used, but rather degrees_east and degrees_north, respectively, as per the CF conventions.

The time unit of Modified Julian Date is also not standard and not described in the readme.txt file. The values obviously represent days from some baseline date but beyond that it's a mystery. Successive data points are mostly around 30 days apart, but it ranges from 11 to 370! From the readme.txt file it is clear that these are months but what are "the 216 valid months when official GRACE(-FO) products are avaliable"?

You can load the data into terra by using ncdf4 to read the arrays of the variables, transpose and flip the arrays as required, and then do terra::rast() on each array. For the time dimension I would suggest that you reach out to the authors of the data set to understand how to map the dimensions values to the "216 valid months".