Unidata / netcdf-fortran

Official GitHub repository for netCDF-Fortran libraries, which depend on the netCDF C library. Install the netCDF C library first.

string attributes are not supported yet? #181

Open leonid-butenko opened 5 years ago

leonid-butenko commented 5 years ago

I'm curious to know whether string attributes are supported by the current netcdf-fortran library. More and more netCDF products are generated using string attributes.

I'm using version 4.4.5 in my env and I observe the following behaviour:

input nc file:

// global attributes:
                string :title = "test title" ;

nf90_get_att(ncid, NF90_GLOBAL, "title", value)

output:

NetCDF: Attempt to convert between text & numbers

Any ideas/workarounds on how to handle these attributes?

DennisHeimbigner commented 5 years ago

No, see: https://github.com/Unidata/netcdf-fortran/issues/72 Passing a single string could be done, but there are a number of cases where we need to pass an array of strings (roughly C char**), and we cannot figure out how to do it. Suggestions welcome.
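
For the reading direction only, one possible approach is to view the C `char**` as a Fortran array of `type(c_ptr)` and convert each element in turn. The following is a minimal sketch under that assumption, not a proposal for the library API: the module name `c_string_array` and the procedure `c_strings_to_fortran` are hypothetical and not part of netcdf-fortran, and the code assumes `c_char` is the same kind as default character (true for common compilers). It does not address writing string arrays or freeing the C-side memory.

```fortran
module c_string_array
  ! Hypothetical helper, NOT part of netcdf-fortran.
  use, intrinsic :: iso_c_binding, only: c_ptr, c_char, c_size_t, &
                                         c_associated, c_f_pointer
  implicit none
  private
  public :: c_strings_to_fortran

  interface
    ! strlen from the C standard library
    function c_strlen(cs) bind(c, name='strlen') result(n)
      import :: c_ptr, c_size_t
      type(c_ptr), value :: cs
      integer(c_size_t)  :: n
    end function c_strlen
  end interface

contains

  ! pp is the address of the first element of a C array holding n
  ! NUL-terminated strings (a C "char **"). The result is a character
  ! array whose common length is that of the longest string; shorter
  ! entries are blank padded.
  subroutine c_strings_to_fortran(pp, n, strings)
    type(c_ptr), intent(in) :: pp
    integer,     intent(in) :: n
    character(len=:), allocatable, intent(out) :: strings(:)

    type(c_ptr), pointer            :: ptrs(:)
    character(kind=c_char), pointer :: chars(:)
    integer, allocatable            :: lens(:)
    integer :: i, j

    if (n <= 0 .or. .not. c_associated(pp)) then
      allocate(character(len=0) :: strings(0))
      return
    end if

    ! View the char** as a Fortran array of n C pointers.
    call c_f_pointer(pp, ptrs, [n])

    ! First pass: measure each string (NULL entries count as empty).
    allocate(lens(n))
    lens = 0
    do i = 1, n
      if (c_associated(ptrs(i))) lens(i) = int(c_strlen(ptrs(i)))
    end do

    allocate(character(len=maxval(lens)) :: strings(n))
    strings = ' '

    ! Second pass: copy character by character.
    do i = 1, n
      if (lens(i) == 0) cycle
      call c_f_pointer(ptrs(i), chars, [lens(i)])
      do j = 1, lens(i)
        strings(i)(j:j) = chars(j)
      end do
    end do
  end subroutine c_strings_to_fortran

end module c_string_array
```

The design choice here is to size the result to the longest string rather than truncate into a caller-supplied length, which needs deferred-length character support (Fortran 2003).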

leonid-butenko commented 5 years ago

In most of the products I deal with at the moment, support for single string attributes would be sufficient. I see the problem has persisted for several years. From that perspective, maybe a partial solution with single-string support could be provided? It's better than nothing, and at least it would cover the needs of some users. Please advise.

dongli commented 2 years ago

What about string type variables? Still seeing "NetCDF: Attempt to convert between text & numbers!".

Roadelse commented 1 year ago

Still seeing "NetCDF: Attempt to convert between text & numbers!" by now...

edwardhartnett commented 1 year ago

String variables for attributes don't make much sense.

If it is a short string, then just use a char array. If it is a long string, then put it in a data variable as a char array, and turn on compression.

The string type is supported in netCDF so that HDF5 files which use the string type can be read. It's not actually a great idea to use it for new data. For example, it can't be compressed because a string type is just a vlen of char, so all the limitations of vlens apply to strings as well.

graziano-giuliani commented 3 days ago

This has a big impact: the new CDS-Beta Copernicus data server is serving ERA5 data in which all text attributes are string attributes:

netcdf sst_2024_07 {
dimensions:
    valid_time = 31 ;
    latitude = 721 ;
    longitude = 1440 ;
variables:
    int64 number ;
        string number:long_name = "ensemble member numerical id" ;
        string number:units = "1" ;
        string number:standard_name = "realization" ;
    int64 valid_time(valid_time) ;
        string valid_time:long_name = "time" ;
        string valid_time:standard_name = "time" ;
        string valid_time:units = "seconds since 1970-01-01" ;
        string valid_time:calendar = "proleptic_gregorian" ;
    double latitude(latitude) ;
        latitude:_FillValue = NaN ;
        string latitude:units = "degrees_north" ;
        string latitude:standard_name = "latitude" ;
        string latitude:long_name = "latitude" ;
        string latitude:stored_direction = "decreasing" ;
    double longitude(longitude) ;
        longitude:_FillValue = NaN ;
        string longitude:units = "degrees_east" ;
        string longitude:standard_name = "longitude" ;
        string longitude:long_name = "longitude" ;
    string expver(valid_time) ;
    float sst(valid_time, latitude, longitude) ;
        sst:_FillValue = NaNf ;
        sst:GRIB_paramId = 34LL ;
        string sst:GRIB_dataType = "an" ;
        sst:GRIB_numberOfPoints = 1038240LL ;
        string sst:GRIB_typeOfLevel = "surface" ;
        sst:GRIB_stepUnits = 1LL ;
        string sst:GRIB_stepType = "instant" ;
        string sst:GRIB_gridType = "regular_ll" ;
        sst:GRIB_uvRelativeToGrid = 0LL ;
        sst:GRIB_NV = 0LL ;
        sst:GRIB_Nx = 1440LL ;
        sst:GRIB_Ny = 721LL ;
        string sst:GRIB_cfName = "unknown" ;
        string sst:GRIB_cfVarName = "sst" ;
        string sst:GRIB_gridDefinitionDescription = "Latitude/Longitude Grid" ;
        sst:GRIB_iDirectionIncrementInDegrees = 0.25 ;
        sst:GRIB_iScansNegatively = 0LL ;
        sst:GRIB_jDirectionIncrementInDegrees = 0.25 ;
        sst:GRIB_jPointsAreConsecutive = 0LL ;
        sst:GRIB_jScansPositively = 0LL ;
        sst:GRIB_latitudeOfFirstGridPointInDegrees = 90. ;
        sst:GRIB_latitudeOfLastGridPointInDegrees = -90. ;
        sst:GRIB_longitudeOfFirstGridPointInDegrees = 0. ;
        sst:GRIB_longitudeOfLastGridPointInDegrees = 359.75 ;
        sst:GRIB_missingValue = 3.40282346638529e+38 ;
        string sst:GRIB_name = "Sea surface temperature" ;
        string sst:GRIB_shortName = "sst" ;
        string sst:GRIB_units = "K" ;
        string sst:long_name = "Sea surface temperature" ;
        string sst:units = "K" ;
        string sst:standard_name = "unknown" ;
        sst:GRIB_surface = 0. ;
        string sst:coordinates = "number valid_time latitude longitude expver" ;

// global attributes:
        string :GRIB_centre = "ecmf" ;
        string :GRIB_centreDescription = "European Centre for Medium-Range Weather Forecasts" ;
        :GRIB_subCentre = 0LL ;
        string :Conventions = "CF-1.7" ;
        string :institution = "European Centre for Medium-Range Weather Forecasts" ;
        string :history = "2024-10-02T11:01 GRIB to CDM+CF via cfgrib-0.9.14.1/ecCodes-2.36.0 with {\"source\": \"data.grib\", \"filter_by_keys\": {\"stream\": [\"oper\"]}, \"encode_cf\": [\"parameter\", \"time\", \"geography\", \"vertical\"]}" ;
}

All Fortran code trying to read any of those attributes fails miserably with NetCDF: Attempt to convert between text & numbers

edwardhartnett commented 3 days ago

OK, if you are trying for NUG or CF conventions, those attributes are all supposed to be text arrays, not strings, IIRC.

graziano-giuliani commented 3 days ago

What about just extending nf90_get_att_text?

diff --git a/fortran/netcdf_attributes.F90 b/fortran/netcdf_attributes.F90
index 014654c..6171952 100644
--- a/fortran/netcdf_attributes.F90
+++ b/fortran/netcdf_attributes.F90
@@ -67,13 +67,52 @@
   end function nf90_put_att_text
   ! -------
   function nf90_get_att_text(ncid, varid, name, values)
+    use, intrinsic :: iso_c_binding, only: c_ptr, c_size_t, c_f_pointer, c_int
+    implicit none
     integer,                          intent( in) :: ncid, varid
     character(len = *),               intent( in) :: name
     character(len = *),               intent(out) :: values
     integer                                       :: nf90_get_att_text
-
-    values = ' '  !! make sure result will be blank padded
-    nf90_get_att_text = nf_get_att_text(ncid, varid, name, values)
+    interface
+      integer(c_int) function nc_get_att_string(ncid, varid, name, pp) bind(c)
+        use iso_c_binding , only : c_int , c_char , c_ptr
+        integer(c_int) , value :: ncid , varid
+        character(kind=c_char) , intent(in) :: name
+        type(c_ptr), intent(out) :: pp
+      end function nc_get_att_string
+    end interface
+    interface
+      integer(c_size_t) function strlen(cs) bind(c, name='strlen')
+         use, intrinsic :: iso_c_binding , only : c_size_t , c_ptr
+         implicit none
+         type(c_ptr), intent(in), value :: cs
+      end function strlen
+    end interface
+    integer :: xtype , nlen , attid , i
+    integer(c_int) :: c_ncid , c_varid , c_status , c_nlen
+    type(c_ptr) :: c_str
+    character(len_trim(name)+1) :: c_aname
+    character , pointer :: f_str(:)
+    nf90_get_att_text = nf90_inquire_attribute(ncid, varid, name, &
+        xtype, nlen, attid)
+    if ( nf90_get_att_text == nf90_noerr ) then
+      if ( xtype == nf90_string .and. nlen == 1 ) then
+        c_ncid = ncid
+        c_varid = varid - 1
+        c_aname = name//char(0)
+        c_status = nc_get_att_string(c_ncid, c_varid, c_aname, c_str)
+        nf90_get_att_text = c_status
+        if ( nf90_get_att_text == nf90_noerr ) then
+          call c_f_pointer(c_str,f_str,[strlen(c_str)])
+          values = adjustl("")
+          do i = 1, size(f_str)
+            values(i:i) = f_str(i)
+          end do
+        end if
+      else
+        values = ' '  !! make sure result will be blank padded
+        nf90_get_att_text = nf_get_att_text(ncid,varid,name,values)
+      end if
+    end if
   end function nf90_get_att_text
   ! -------
   ! Integer attributes
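
Two things may be worth checking in a patch like the one above: the copy loop runs to `size(f_str)` without regard to `len(values)`, so an attribute longer than the caller's buffer would index past the end of `values`, and the buffer returned by `nc_get_att_string` is never released with the C function `nc_free_string`. A minimal sketch of a bounded copy follows; the module name `c_string_copy` and the procedure `copy_c_string` are hypothetical and not part of netcdf-fortran, and the code assumes `c_char` is the same kind as default character.

```fortran
module c_string_copy
  ! Hypothetical helper, NOT part of netcdf-fortran.
  use, intrinsic :: iso_c_binding, only: c_ptr, c_char, c_size_t, &
                                         c_associated, c_f_pointer
  implicit none
  private
  public :: copy_c_string

  interface
    ! strlen from the C standard library
    function c_strlen(cs) bind(c, name='strlen') result(n)
      import :: c_ptr, c_size_t
      type(c_ptr), value :: cs
      integer(c_size_t)  :: n
    end function c_strlen
  end interface

contains

  ! Copy a NUL-terminated C string into a blank-padded Fortran
  ! character variable, never writing beyond len(values).
  ! truncated is .true. when the C string did not fit.
  subroutine copy_c_string(c_str, values, truncated)
    type(c_ptr),      intent(in)  :: c_str
    character(len=*), intent(out) :: values
    logical,          intent(out) :: truncated

    character(kind=c_char), pointer :: f_str(:)
    integer :: i, n

    values = ' '            ! make sure result is blank padded
    truncated = .false.
    if (.not. c_associated(c_str)) return

    n = int(c_strlen(c_str))
    if (n == 0) return

    call c_f_pointer(c_str, f_str, [n])
    do i = 1, min(n, len(values))
      values(i:i) = f_str(i)
    end do
    truncated = n > len(values)
  end subroutine copy_c_string

end module c_string_copy
```

In the context of the patch, a call such as `call copy_c_string(c_str, values, truncated)` could stand in for the copy loop; releasing the C-side buffer afterwards would still need a separate `bind(c)` interface to `nc_free_string`.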

edwardhartnett commented 3 days ago

OK, to sum up, you've been told:

Yet your answer is that we should change the library? How about instead you change your format to one that is in conformance with existing functionality, all past examples, and all current conventions?

graziano-giuliani commented 3 days ago

It is not MY code that is creating the non-CF-compliant files (which nonetheless claim to be CF compliant, see above), but this site:

https://cds.climate.copernicus.eu/

I just need to use the data in my Fortran program, as most probably do A LOT of other users. It is just a heads-up that the number of people asking for this to be fixed will grow, grow, grow. I already have the fix in my code, thank you.

DennisHeimbigner commented 3 days ago

It is unfortunate that Copernicus is using, e.g.:

        string number:standard_name = "realization" ;

instead of

        char number:standard_name = "realization" ;

The latter can be handled by Fortran. But I have a vague recollection that someone was proposing better support for counted strings in Fortran that would have been usable to store netcdf-c strings. Does anyone know if that happened?

graziano-giuliani commented 3 days ago

I think it still hasn't happened (the first proposal to J3 dates from 1982!). As for ISO, this is the reference document outlining it:

https://www.iso.org/standard/26934.html

I only know of an implementation of the proposal as a library; no compiler I know of has picked this up:

https://github.com/everythingfunctional/iso_varying_string

graziano-giuliani commented 3 days ago

It looks like string and char are now BOTH allowed in the latest CF-1.11, with no preference. Still, that is not the case in CF-1.7, which Copernicus claims to be using.

https://cfconventions.org/Data/cf-conventions/cf-conventions-1.11/cf-conventions.html#char-and-string-variables-ex

graziano-giuliani commented 3 days ago

Cross posting here the ECMWF forum post: https://forum.ecmwf.int/t/cf-1-7-compliance-of-netcdf-data-from-cds-beta/6587?u=graziano_giuliani

edwardhartnett commented 2 days ago

Well, you'd think the CF people would have tried this before making strings acceptable.

graziano-giuliani commented 2 days ago

It was apparently already allowed in CF-1.9:

https://cfconventions.org/Data/cf-documents/requirements-recommendations/conformance-1.9.html Last updated 2021-09-10 14:51:35 UTC

My guess is they are testing using Python, or even directly from xarray... Granted, no dataset with string attributes will be created in the foreseeable future from a running Fortran program, for lack of a matching type in the language. Note also that, for "[insert any reason here]", ECMWF is using the NC_INT64 type to store integer attributes like "0", generating warning messages from a lot of simple viewers (ncview < 2.1.10, for example: ncview: netcdf_dim_value: unknown data type (10) for dimension valid_time).

WardF commented 2 days ago

Pinging @ethanrd to weigh in on the CF-related discussion.

ethanrd commented 2 days ago

Thanks @WardF. I started a CF discussion (#378) referencing this issue earlier today. Some discussion so far of adding cautionary text to CF about these limitations and trying to review new features across all major libraries going forward.