simion1232006 / nctoolbox

Automatically exported from code.google.com/p/nctoolbox

empty cell array returned for axes attributes over opendap using ncdataset #57

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
What steps will reproduce the problem?
1. ifn = 'http://data.nodc.noaa.gov/opendap/ghrsst/L4/GLOB/JPL_OUROCEAN/G1SST/2010/160/20100609-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc.bz2'
2. ds = ncdataset(ifn);  % takes 1-2 minutes to get access to compressed file
3. ds.attributes('time') or 'lat' or 'lon'  --- axes variables only!

What is the expected output? 

cell array with attributes like 'units', 'long_name', 'standard_name'

What do you see instead? 

empty cell array

>> ds.attributes('time')

ans =

   Empty cell array: 0-by-2

What version of the product are you using? On what operating system?

Please provide any additional information below.

I can get all the data and most of the metadata via ncdataset, except the 
attributes for axes variables.  In the following example lat, lon, and time are 
axes variables.  Note the empty cell array returned by ds.attributes('time') in 
the MATLAB session below.

But when I download the file and uncompress it locally and use ncdataset on the 
local file, I get the attributes I desire.

----------------------------------

>> ifn = 'http://data.nodc.noaa.gov/opendap/ghrsst/L4/GLOB/JPL_OUROCEAN/G1SST/2010/160/20100609-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc.bz2'
>> ds = ncdataset(ifn);  % takes 1-2 minutes to get access to compressed file
>> ds.variables

ans =

    'analysed_sst'
    'mask'
    'analysis_error'
    'time'
    'lat'
    'lon'

>> ds.attributes

ans =

    'time.units'               'seconds since 1981-01-01 00:00:00'
    'time.long_name'           'reference time of sst field'     
    'time.standard_name'       'time'                            
    'time.axis'                'T'                               
    'time.calendar'            'Gregorian'                       
<clipped>

>> ds.attributes('analysed_sst')

ans =

    '_CoordinateAxes'    'time lat lon '                  
    'long_name'          'analysed sea surface temperature'
    'standard_name'      'sea_surface_temperature'        
    'type'               'foundation'                     
    'units'              'kelvin'                         

>> ds.attributes('time')

ans =

   Empty cell array: 0-by-2

Original issue reported on code.google.com by sara.m.h...@gmail.com on 1 Nov 2012 at 9:10

GoogleCodeExporter commented 8 years ago
I'm investigating and making some notes here. Here's the CDL for your file when 
read over OPeNDAP in Matlab:

netcdf dods://data.nodc.noaa.gov/opendap/ghrsst/L4/GLOB/JPL_OUROCEAN/G1SST/2010/160/20100609-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc.bz2 {
 dimensions:
   time = 1;
   lat = 16000;
   lon = 36000;
 variables:
   float analysed_sst(time=1, lat=16000, lon=36000);
     :_CoordinateAxes = "time lat lon ";
     :long_name = "analysed sea surface temperature";
     :standard_name = "sea_surface_temperature";
     :type = "foundation";
     :units = "kelvin";
   byte mask(time=1, lat=16000, lon=36000);
     :_Unsigned = "true";
     :_CoordinateAxes = "time lat lon ";
     :long_name = "sea/land/lake/ice field composite mask";
     :_FillValue = -128B; // byte
     :flag_values = "1b, 2b, 4b, 8b";
     :flag_meanings = "sea land lake ice";
     :comment = "b0: 1=grid cell is open sea water012b1: 1=land is present in this grid cell012b2: 1=lake surface is present in this grid cell012b3: 1=sea ice is present in this grid cell012b4-b7: reserved for future grid mask data";
   float analysis_error(time=1, lat=16000, lon=36000);
     :_CoordinateAxes = "time lat lon ";
     :long_name = "estimated error standard deviation of analysed_sst";
     :units = "kelvin";
   int time(time=1);
   float lat(lat=16000);
   float lon(lon=36000);

 :time.units = "seconds since 1981-01-01 00:00:00";
 :time.long_name = "reference time of sst field";
 :time.standard_name = "time";
 :time.axis = "T";
 :time.calendar = "Gregorian";
 :lon.long_name = "longitude";
 :lon.standard_name = "longitude";
 :lon.axis = "X";
 :lon.units = "degrees_east";
 :lat.long_name = "latitude";
 :lat.standard_name = "latitude";
 :lat.axis = "Y";
 :lat.units = "degrees_north";
 :Conventions = "CF-1.0";
 :title = "G1SST, 1km blended SST";
 :DSD_entry_id = "JPL_OUROCEAN-L4UHfnd-GLOB-G1SST";
 :references = "A Blended Global 1-km Sea Surface Temperature Data Set for Research and Applications012by Yi Chao, Benyang Tang, Zhijin Li, Peggy Li, Quoc Vu";
 :institution = "Jet Propulsion Laboratory, The OurOcean Team";
 :contact = "yi.chao@jpl.nasa.gov";
 :GDS_version_id = "v1.0-rev1.7";
 :netcdf_version_id = "3.6.0";
 :creation_date = "2010-06-14 UTC";
 :product_version = "1.0";
 :history = "1km SST blended from 8 satellite observations";
 :spatial_resolution = "1 km";
 :source_data = "AMSRE,AVHRR,TMI,MODIS,MODIS,GOES,METOP,MTSAT,SEVIRI,AATSR,in-situ";
 :comment = "";
 :start_date = "2010-06-14 UTC";
 :start_time = "00:00:00 UTC";
 :stop_date = "2010-06-10 UTC";
 :stop_time = "00:00:00 UTC";
 :southernmost_latitude = -80.0f; // float
 :northernmost_latitude = 80.0f; // float
 :westernmost_longitude = -180.0f; // float
 :easternmost_longitude = 180.0f; // float
 :file_quality_index = 0; // int
}

Original comment by bschlin...@gmail.com on 1 Nov 2012 at 10:04

GoogleCodeExporter commented 8 years ago
Downloaded the file (it's a biggie). Here's the CDL when opened locally:

netcdf /Users/brian/Downloads/20100609-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc.bz2 {
 dimensions:
   time = 1;
   lon = 36000;
   lat = 16000;
 variables:
   float analysed_sst(time=1, lat=16000, lon=36000);
     :long_name = "analysed sea surface temperature";
     :standard_name = "sea_surface_temperature";
     :type = "foundation";
     :units = "kelvin";
   byte mask(time=1, lat=16000, lon=36000);
     :long_name = "sea/land/lake/ice field composite mask";
     :_FillValue = -128B; // byte
     :flag_values = "1b, 2b, 4b, 8b";
     :flag_meanings = "sea land lake ice";
     :comment = "b0: 1=grid cell is open sea water\nb1: 1=land is present in this grid cell\nb2: 1=lake surface is present in this grid cell\nb3: 1=sea ice is present in this grid cell\nb4-b7: reserved for future grid mask data";
   float analysis_error(time=1, lat=16000, lon=36000);
     :long_name = "estimated error standard deviation of analysed_sst";
     :units = "kelvin";
   int time(time=1);
     :units = "seconds since 1981-01-01 00:00:00";
     :long_name = "reference time of sst field";
     :standard_name = "time";
     :axis = "T";
     :calendar = "Gregorian";
     :_CoordinateAxisType = "Time";
   float lon(lon=36000);
     :long_name = "longitude";
     :standard_name = "longitude";
     :axis = "X";
     :units = "degrees_east";
     :_CoordinateAxisType = "Lon";
   float lat(lat=16000);
     :long_name = "latitude";
     :standard_name = "latitude";
     :axis = "Y";
     :units = "degrees_north";
     :_CoordinateAxisType = "Lat";

 :Conventions = "CF-1.0";
 :title = "G1SST, 1km blended SST";
 :DSD_entry_id = "JPL_OUROCEAN-L4UHfnd-GLOB-G1SST";
 :references = "A Blended Global 1-km Sea Surface Temperature Data Set for Research and Applications\nby Yi Chao, Benyang Tang, Zhijin Li, Peggy Li, Quoc Vu";
 :institution = "Jet Propulsion Laboratory, The OurOcean Team";
 :contact = "yi.chao@jpl.nasa.gov";
 :GDS_version_id = "v1.0-rev1.7";
 :netcdf_version_id = "3.6.0";
 :creation_date = "2010-06-14 UTC";
 :product_version = "1.0";
 :history = "1km SST blended from 8 satellite observations";
 :spatial_resolution = "1 km";
 :source_data = "AMSRE,AVHRR,TMI,MODIS,MODIS,GOES,METOP,MTSAT,SEVIRI,AATSR,in-situ";
 :comment = "";
 :start_date = "2010-06-14 UTC";
 :start_time = "00:00:00 UTC";
 :stop_date = "2010-06-10 UTC";
 :stop_time = "00:00:00 UTC";
 :southernmost_latitude = -80.0f; // float
 :northernmost_latitude = 80.0f; // float
 :westernmost_longitude = -180.0f; // float
 :easternmost_longitude = 180.0f; // float
 :file_quality_index = 0; // int
}
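
Comparing the two listings: over OPeNDAP the axis attributes are not lost. They show up as global attributes named time.units, lon.axis, lat.units and so on, while the local file attaches them to the variables. Until the underlying bug is tracked down, one possible workaround is to filter them back out of the global list. A minimal sketch, assuming ds.attributes returns an N-by-2 cell array of names and values as in the session in the original report; the variable names below are illustrative and nothing here is nctoolbox API:

```matlab
% Sample rows copied from the ds.attributes listing earlier in this thread.
% With a live dataset this would be: globalAttrs = ds.attributes;
globalAttrs = { ...
    'time.units',  'seconds since 1981-01-01 00:00:00'; ...
    'time.axis',   'T'; ...
    'lon.axis',    'X'; ...
    'Conventions', 'CF-1.0'};

varName = 'time';            % axis variable whose attributes we want
prefix  = [varName '.'];     % flattened names look like 'time.units'

% Keep only the rows whose name starts with the prefix.
keep  = strncmp(globalAttrs(:, 1), prefix, numel(prefix));
attrs = globalAttrs(keep, :);

% Strip the prefix so the names match what the local file reports.
for k = 1:size(attrs, 1)
    attrs{k, 1} = attrs{k, 1}(numel(prefix) + 1:end);
end

disp(attrs)
```

For the sample rows this leaves a 2-by-2 cell array holding 'units' and 'axis' with their values, the same shape as the result of ds.attributes('analysed_sst').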

Original comment by bschlin...@gmail.com on 1 Nov 2012 at 10:16

GoogleCodeExporter commented 8 years ago
OK, I can confirm that there's a bug. Unfortunately, it's not in 
nctoolbox, but rather it's lurking somewhere in the NetCDF-Java libraries. I'll 
submit a bug report to Unidata.

Original comment by bschlin...@gmail.com on 1 Nov 2012 at 10:18

GoogleCodeExporter commented 8 years ago
Actually, now that I think about it, the bug may be in any of the following: 
NetCDF-Java, OPeNDAP's Java libraries, or even the OPeNDAP server itself. 
<sheesh>. I'll try opening it with another OPeNDAP client.

Original comment by bschlin...@gmail.com on 1 Nov 2012 at 10:22

GoogleCodeExporter commented 8 years ago
Hmmm, when I try to access the OPeNDAP DDS on the site at 
http://data.nodc.noaa.gov/opendap/ghrsst/L4/GLOB/JPL_OUROCEAN/G1SST/2010/160/20100609-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc.bz2.dds
it returns: 

Error {
    code = 1001;
    message = "Could not open bes#www#sites#data.nodc#htdocs##ghrsst#L4#GLOB#JPL_OUROCEAN#G1SST#2010#160#20100609-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc.";
};

I wonder if that's the problem. You can access descriptor information for the 
file at 
http://data.nodc.noaa.gov/opendap/ghrsst/L4/GLOB/JPL_OUROCEAN/G1SST/2010/160/

Original comment by bschlin...@gmail.com on 1 Nov 2012 at 10:27

GoogleCodeExporter commented 8 years ago
I sent an email to the NetCDF-Java mailing list describing the problem. I can 
also confirm that the error in comment #5 above isn't the cause. It is only 
returned occasionally, and when it is, nctoolbox throws an exception and fails 
to open the dataset.

Anyway, when/if I hear back from the mailing list I'll post a response here.
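
Because the server error from comment #5 only shows up occasionally, simply retrying the open may get past it. A rough sketch, not part of nctoolbox: openWithRetry is a hypothetical helper, and the retry count and pause in the usage line are arbitrary choices.

```matlab
function ds = openWithRetry(opener, maxTries, pauseSeconds)
% OPENWITHRETRY Call opener() until it succeeds or maxTries is reached.
%   opener       : function handle taking no arguments, e.g. @() ncdataset(url)
%   maxTries     : maximum number of attempts (positive integer)
%   pauseSeconds : delay in seconds between failed attempts
for attempt = 1:maxTries
    try
        ds = opener();
        return                      % success, stop retrying
    catch err
        fprintf('Attempt %d of %d failed: %s\n', ...
            attempt, maxTries, err.message);
        if attempt == maxTries
            rethrow(err);           % out of attempts, surface the last error
        end
        pause(pauseSeconds);
    end
end
end
```

With the URL from the original report in ifn, this could be called as ds = openWithRetry(@() ncdataset(ifn), 3, 10). It only helps with the intermittent open failure; the empty axis attributes are a separate problem.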

Original comment by bschlin...@gmail.com on 1 Nov 2012 at 10:49

GoogleCodeExporter commented 8 years ago
Here's a great response from Rich Signell:

"Brian,

I was unable to access that bzip2 file URL via OPeNDAP
http://data.nodc.noaa.gov/opendap/ghrsst/L4/GLOB/JPL_OUROCEAN/G1SST/2010/160/20100609-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc.bz2

But I did try downloading the file, unzipping it, and putting it on our TDS:
http://geoport.whoi.edu/thredds/catalog/usgs/data1/rsignell/test/catalog.html?dataset=usgs/data1/rsignell/test/20100609-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc

And if I then try accessing the OPeNDAP URL
http://geoport.whoi.edu/thredds/dodsC/usgs/data1/rsignell/test/20100609-JPL_OUROCEAN-L4UHfnd-GLOB-v01-fv01_0-G1SST.nc
it works fine.

I note that by using BZIP2 compression, NODC manages to compress this
2.8GB NetCDF3 file down to 123MB, but of course, that makes it very
awkward to extract data from using OPeNDAP. If instead they
converted the NetCDF3 to NetCDF4, they could get the file easily down
to 250MB by using simple deflation, and the efficiency of extraction
would be *way* better.

Check out the NetCDF4 chunked version:
http://geoport.whoi.edu/thredds/catalog/usgs/data1/rsignell/test/catalog.html?dataset=usgs/data1/rsignell/test/sst_netcdf4_chunk100.nc

-Rich"
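
To put a rough number on Rich's point about extraction efficiency, here is a back-of-the-envelope sketch. The grid size comes from the CDL above; the 100-by-100 chunk shape is only a guess based on the name of the chunked file, and the index ranges are a made-up subset request.

```matlab
% Grid size taken from the CDL listings in this thread.
nLat = 16000;
nLon = 36000;

% ASSUMPTION: square 100-by-100 chunks in lat/lon (not confirmed here).
chunk = 100;

% Hypothetical subset request (1-based, inclusive index ranges).
latRange = [5001 5200];
lonRange = [10001 10250];

% Chunks overlapped along each axis: last chunk index minus first, plus one.
chunksLat = ceil(latRange(2) / chunk) - ceil(latRange(1) / chunk) + 1;
chunksLon = ceil(lonRange(2) / chunk) - ceil(lonRange(1) / chunk) + 1;

chunksRead  = chunksLat * chunksLon;
chunksTotal = ceil(nLat / chunk) * ceil(nLon / chunk);

fprintf('Chunks read: %d of %d\n', chunksRead, chunksTotal);
```

With these assumed numbers a chunked file needs 6 of 57600 chunks for the request, whereas a bzip2-compressed file generally has to be decompressed before any subset of it can be read.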

Original comment by bschlin...@gmail.com on 5 Nov 2012 at 6:18

GoogleCodeExporter commented 8 years ago
Rich, I hope it's OK, but I'm going to forward your suggestion to the NODC 
folks.

Original comment by bschlin...@gmail.com on 5 Nov 2012 at 6:26

GoogleCodeExporter commented 8 years ago
Here's a reply I received from Ken Casey at NODC in response to Rich's 
workaround in Comment 7 above:

"Hi Brian,

We are (painfully) aware of the limitations of the netCDF-3 / bzip2 
combination, especially for truly gigantic global 1km resolution data like the 
G1SST product.  Unfortunately, the GHRSST Data Specification Version 1 
procedures and standards used for the existing 55 terabytes and 2.38 million 
GHRSST netCDF files were established long before netCDF-4 even existed, and we 
still receive around 1000 new files in this format every day.  We have looked 
at the situation and don't currently have the computational resources to 
decompress, convert to netCDF-4, and then re-archive everything.  Please 
understand that we do not choose the compression scheme or the file format 
version for these data.  In fact, we are very active in advocating the use of 
internally compressed netCDF-4 for the ocean data community (see for example 
our NODC netCDF-4 templates page at 
http://www.nodc.noaa.gov/data/formats/netcdf/) and our own GHRSST v2 product, 
Pathfinder Version 5.2, is in internally compressed netCDF-4 (see 
http://pathfinder.nodc.noaa.gov).

However, there are a couple of related things going on.  First, the GHRSST 
community has published a Version 2 of the GHRSST Data Specification which 
encourages the use of chunked netCDF-4 data.  Many of the GHRSST data providers 
(the RDACs) are in the process of preparing their data in netCDF-4 and some are 
even planning on reprocessing their older data.  If I am not mistaken, JPL is 
planning on converting to netCDF-4 soon.  Second, we are considering some 
experiments using cloud computing resources that would allow us to convert the 
older data in a cost effective manner.  

I wish we could do more to improve the efficiency of access to these data at 
this time but we will continue working with the GHRSST community and exploring 
ways to do a more massive conversion of the older data in a cost effective way.

Ken"

Original comment by bschlin...@gmail.com on 6 Nov 2012 at 4:55