pytroll / satpy

Python package for earth-observing satellite data processing
http://satpy.readthedocs.org/en/latest/
GNU General Public License v3.0
1.07k stars 297 forks source link

loading of variables does not work for Scene created with satpy_cf_nc reader #2609

Open johanna-mayer opened 1 year ago

johanna-mayer commented 1 year ago

Describe the bug After reading in a Scene using the satpy_cf_nc reader from a netCDF file, loading of variables throws an error. The netCDF file was created using the satpy cf_writer.

To Reproduce

from satpy import Scene
import glob

# load Seviri data 
filenames = glob.glob('data/HRIT/*201707011245*')
scn = Scene(filenames=filenames, reader='seviri_l1b_hrit')
scn.load(['VIS006', 'IR_108'])

# compose filename s.t. it is compatible with satpy_cf_nc reader
attributes = scn[scn.keys()[0]].attrs
platform_name = attributes['platform_name']
sensor = attributes['sensor']
start_time = attributes['start_time']
end_time = attributes['end_time']
filename = f'{platform_name}-{sensor}-{start_time:%Y%m%d%H%M%S}-{end_time:%Y%m%d%H%M%S}.nc'

# save as netCDF
scn.save_datasets(
            writer='cf', 
            filename=os.path.join('/tmp', filename)
            )

# read data back in using satpy_cf_nc reader
scn1 = Scene(
    reader='satpy_cf_nc', 
    filenames=[os.path.join('/tmp', filename)],
) 
scn1.load(['IR_108'])

Expected behavior l expected that the data of the 'IR_108' channel in this example would load such that I can access it with scn1['IR_108'].

Actual results

[ERROR: 2023-10-23 18:55:07 : satpy.readers.yaml_reader] Could not load dataset 'DataID(name='IR_108', wavelength=WavelengthRange(min=9.8, central=10.8, max=11.8, unit='µm'), resolution=3000.403165817, calibration=<calibration.brightness_temperature>, modifiers=())': dictionary update sequence element #0 has length 1; 2 is required
Traceback (most recent call last):
  File "/athome/maye_jh/miniforge3/envs/satenv_updated/lib/python3.8/site-packages/satpy/readers/yaml_reader.py", line 823, in _load_dataset_with_area
    ds = self._load_dataset_data(file_handlers, dsid, **kwargs)
  File "/athome/maye_jh/miniforge3/envs/satenv_updated/lib/python3.8/site-packages/satpy/readers/yaml_reader.py", line 723, in _load_dataset_data
    proj = self._load_dataset(dsid, ds_info, file_handlers, **kwargs)
  File "/athome/maye_jh/miniforge3/envs/satenv_updated/lib/python3.8/site-packages/satpy/readers/yaml_reader.py", line 715, in _load_dataset
    combined_info = file_handlers[0].combine_info(
  File "/athome/maye_jh/miniforge3/envs/satenv_updated/lib/python3.8/site-packages/satpy/readers/file_handlers.py", line 118, in combine_info
    new_dict.update(self._combine_time_parameters(all_infos))
  File "/athome/maye_jh/miniforge3/envs/satenv_updated/lib/python3.8/site-packages/satpy/readers/file_handlers.py", line 155, in _combine_time_parameters
    time_params_comb.update(d)
ValueError: dictionary update sequence element #0 has length 1; 2 is required
[WARNING: 2023-10-23 18:55:07 : satpy.scene] The following datasets were not created and may require resampling to be generated: DataID(name='IR_108', wavelength=WavelengthRange(min=9.8, central=10.8, max=11.8, unit='µm'), resolution=3000.403165817, calibration=<calibration.brightness_temperature>, modifiers=())

Environment Info:

  abi_l1b:  ok
  abi_l1b_scmi:  ok
  abi_l2_nc:  ok
  acspo:  ok
  agri_fy4a_l1:  ok
  agri_fy4b_l1:  ok
  ahi_hrit:  ok
  ahi_hsd:  ok
  ahi_l1b_gridded_bin:  ok
  ami_l1b:  ok
  amsr2_l1b:  ok
  amsr2_l2:  ok
  amsr2_l2_gaasp:  ok
  amsub_l1c_aapp:  ok
  ascat_l2_soilmoisture_bufr:  cannot find module 'satpy.readers.ascat_l2_soilmoisture_bufr' (('Missing eccodes-python and/or eccodes C-library installation. Use conda to install eccodes.\n           Error: ', ModuleNotFoundError("No module named 'eccodes'")))
  atms_l1b_nc:  ok
  atms_sdr_hdf5:  ok
  avhrr_l1b_aapp:  ok
  avhrr_l1b_eps:  ok
  avhrr_l1b_gaclac:  cannot find module 'satpy.readers.avhrr_l1b_gaclac' (No module named 'pygac')
  avhrr_l1b_hrpt:  ok
  avhrr_l1c_eum_gac_fdr_nc:  ok
  caliop_l2_cloud:  cannot find module 'satpy.readers.caliop_l2_cloud' (cannot import name 'Dataset' from 'satpy.dataset' (/athome/maye_jh/miniforge3/envs/satenv_updated/lib/python3.8/site-packages/satpy/dataset/__init__.py))
  clavrx:  ok
  cmsaf-claas2_l2_nc:  ok
  electrol_hrit:  ok
  epic_l1b_h5:  ok
  fci_l1c_nc:  ok
  fci_l2_nc:  ok
  generic_image:  ok
  geocat:  ok
  ghi_l1:  ok
  ghrsst_l2:  ok
  glm_l2:  ok
  gms5-vissr_l1b:  cannot find module 'satpy.readers.gms.gms5_vissr_l1b' (No module named 'numba')
  goes-imager_hrit:  ok
  goes-imager_nc:  ok
  gpm_imerg:  ok
  grib:  cannot find module 'satpy.readers.grib' (No module named 'pygrib')
  hsaf_grib:  cannot find module 'satpy.readers.hsaf_grib' (No module named 'pygrib')
  hsaf_h5:  ok
  hy2_scat_l2b_h5:  ok
  iasi_l2:  ok
  iasi_l2_cdr_nc:  ok
  iasi_l2_so2_bufr:  cannot find module 'satpy.readers.iasi_l2_so2_bufr' (('Missing eccodes-python and/or eccodes C-library installation. Use conda to install eccodes.\n           Error: ', ModuleNotFoundError("No module named 'eccodes'")))
  ici_l1b_nc:  ok
  insat3d_img_l1b_h5:  cannot find module 'satpy.readers.insat3d_img_l1b_h5' (No module named 'datatree' It can be installed with the xarray-datatree package.)
  jami_hrit:  ok
  li_l2_nc:  ok
  maia:  ok
  meris_nc_sen3:  ok
  mersi2_l1b:  ok
  mersi_ll_l1b:  ok
  mhs_l1c_aapp:  ok
  mimicTPW2_comp:  ok
  mirs:  ok
  modis_l1b:  ok
  modis_l2:  ok
  msi_safe:  ok
  msu_gsa_l1b:  ok
  mtsat2-imager_hrit:  ok
  mviri_l1b_fiduceo_nc:  ok
  mwi_l1b_nc:  ok
  mws_l1b_nc:  ok
  nucaps:  ok
  nwcsaf-geo:  ok
  nwcsaf-msg2013-hdf5:  ok
  nwcsaf-pps_nc:  ok
  oceancolorcci_l3_nc:  ok
  olci_l1b:  ok
  olci_l2:  ok
  omps_edr:  ok
  safe_sar_l2_ocn:  ok
  sar-c_safe:  ok
  satpy_cf_nc:  ok
  scatsat1_l2b:  cannot find module 'satpy.readers.scatsat1_l2b' (cannot import name 'Dataset' from 'satpy.dataset' (/athome/maye_jh/miniforge3/envs/satenv_updated/lib/python3.8/site-packages/satpy/dataset/__init__.py))
  seadas_l2:  ok
  seviri_l1b_hrit:  ok
  seviri_l1b_icare:  ok
  seviri_l1b_native:  ok
  seviri_l1b_nc:  ok
  seviri_l2_bufr:  cannot find module 'satpy.readers.seviri_l2_bufr' (Missing eccodes-python and/or eccodes C-library installation. Use conda to install eccodes)
  seviri_l2_grib:  cannot find module 'satpy.readers.seviri_l2_grib' (Missing eccodes-python and/or eccodes C-library installation. Use conda to install eccodes)
  slstr_l1b:  ok
  slstr_l2:  cannot find module 'satpy.readers.slstr_l2' (No module named 'satpy.readers.slstr_l2')
  smos_l2_wind:  ok
  tropomi_l2:  ok
  vaisala_gld360:  ok
  vii_l1b_nc:  ok
  vii_l2_nc:  ok
  viirs_compact:  ok
  viirs_edr_active_fires:  ok
  viirs_edr_flood:  ok
  viirs_l1b:  ok
  viirs_l2_cloud_mask_nc:  ok
  viirs_sdr:  ok
  viirs_vgac_l1c_nc:  ok
  virr_l1b:  ok

  Writers:

  awips_tiled:  ok
  cf:  ok
  geotiff:  ok
  mitiff:  ok
  ninjogeotiff:  ok
  ninjotiff:  cannot find module 'satpy.writers.ninjotiff' (No module named 'pyninjotiff')
  simple_image:  ok

  Extras:

  cartopy:  ok
  geoviews:  No module named 'geoviews'
djhoese commented 1 year ago

Is this code you provided exactly what you ran? I'm surprised it even got this far as your filename is not a real filename but format string.

Could you also provide a ncdump -h of the produced NetCDF file?

johanna-mayer commented 1 year ago

Yes, that is exactly what I ran. Are format strings problematic for the writer?

Sure, here is the ncdump -h of the created file:

netcdf Meteosat-10-seviri-20170701124500-20170701130000 {
dimensions:
        y = 3712 ;
        x = 3712 ;
variables:
        int64 msg_seviri_fes_3km ;
                msg_seviri_fes_3km:crs_wkt = "PROJCRS[\"unknown\",BASEGEOGCRS[\"unknown\",DATUM[\"unknown\",ELLIPSOID[\"unknown\",6378169,295.488065897014,LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]]],PRIMEM[\"Greenwich\",0,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8901]]],CONVERSION[\"unknown\",METHOD[\"Geostationary Satellite (Sweep Y)\"],PARAMETER[\"Longitude of natural origin\",0,ANGLEUNIT[\"degree\",0.0174532925199433],ID[\"EPSG\",8802]],PARAMETER[\"Satellite Height\",35785831,LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]],PARAMETER[\"False easting\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8806]],PARAMETER[\"False northing\",0,LENGTHUNIT[\"metre\",1],ID[\"EPSG\",8807]]],CS[Cartesian,2],AXIS[\"(E)\",east,ORDER[1],LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]],AXIS[\"(N)\",north,ORDER[2],LENGTHUNIT[\"metre\",1,ID[\"EPSG\",9001]]]]" ;
                msg_seviri_fes_3km:false_easting = 0. ;
                msg_seviri_fes_3km:false_northing = 0. ;
                msg_seviri_fes_3km:geographic_crs_name = "unknown" ;
                msg_seviri_fes_3km:grid_mapping_name = "geostationary" ;
                msg_seviri_fes_3km:horizontal_datum_name = "unknown" ;
                msg_seviri_fes_3km:inverse_flattening = 295.488065897014 ;
                msg_seviri_fes_3km:latitude_of_projection_origin = 0. ;
                msg_seviri_fes_3km:long_name = "msg_seviri_fes_3km" ;
                msg_seviri_fes_3km:longitude_of_prime_meridian = 0. ;
                msg_seviri_fes_3km:longitude_of_projection_origin = 0. ;
                msg_seviri_fes_3km:perspective_point_height = 35785831. ;
                msg_seviri_fes_3km:prime_meridian_name = "Greenwich" ;
                msg_seviri_fes_3km:projected_crs_name = "unknown" ;
                msg_seviri_fes_3km:reference_ellipsoid_name = "unknown" ;
                msg_seviri_fes_3km:semi_major_axis = 6378169. ;
                msg_seviri_fes_3km:semi_minor_axis = 6356583.8 ;
                msg_seviri_fes_3km:sweep_angle_axis = "y" ;
        double IR_108_acq_time(y) ;
                IR_108_acq_time:_FillValue = NaN ;
                IR_108_acq_time:long_name = "Mean scanline acquisition time" ;
                IR_108_acq_time:units = "milliseconds since 2017-07-01 12:45:26.581000" ;
                IR_108_acq_time:calendar = "proleptic_gregorian" ;
        double y(y) ;
                y:units = "m" ;
                y:standard_name = "projection_y_coordinate" ;
        double x(x) ;
                x:units = "m" ;
                x:standard_name = "projection_x_coordinate" ;
        double longitude(y, x) ;
                longitude:_FillValue = NaN ;
                longitude:name = "longitude" ;
                longitude:standard_name = "longitude" ;
                longitude:units = "degrees_east" ;
        double latitude(y, x) ;
                latitude:_FillValue = NaN ;
                latitude:name = "latitude" ;
                latitude:standard_name = "latitude" ;
                latitude:units = "degrees_north" ;
        float IR_108(y, x) ;
                IR_108:_FillValue = NaNf ;
                IR_108:calibration = "brightness_temperature" ;
                IR_108:end_time = "2017-07-01 13:00:00" ;
                IR_108:georef_offset_corrected = "false" ;
                IR_108:grid_mapping = "msg_seviri_fes_3km" ;
                IR_108:modifiers = "" ;
                IR_108:nominal_end_time = "2017-07-01 13:00:00" ;
                IR_108:nominal_start_time = "2017-07-01 12:45:00" ;
                IR_108:orbital_parameters = "{\"projection_longitude\": 0.0, \"projection_latitude\": 0.0, \"projection_altitude\": 35785831.0, \"satellite_nominal_longitude\": 0.0, \"satellite_nominal_latitude\": 0.0, \"satellite_actual_longitude\": -0.14103548610869715, \"satellite_actual_latitude\": 0.7857524055230543, \"satellite_actual_altitude\": 35779085.35822027}" ;
                IR_108:platform_name = "Meteosat-10" ;
                IR_108:reader = "seviri_l1b_hrit" ;
                IR_108:resolution = 3000.403165817 ;
                IR_108:sensor = "seviri" ;
                IR_108:standard_name = "toa_brightness_temperature" ;
                IR_108:start_time = "2017-07-01 12:45:00" ;
                IR_108:time_parameters = "{\"nominal_start_time\": \"2017-07-01 12:45:00\", \"nominal_end_time\": \"2017-07-01 13:00:00\", \"observation_start_time\": \"2017-07-01 12:45:10.354000\", \"observation_end_time\": \"2017-07-01 12:57:41.044000\"}" ;
                IR_108:units = "K" ;
                string IR_108:wavelength = "10.8 µm (9.8-11.8 µm)" ;
                IR_108:coordinates = "VIS006_acq_time longitude IR_108_acq_time latitude" ;
        double VIS006_acq_time(y) ;
                VIS006_acq_time:_FillValue = NaN ;
                VIS006_acq_time:long_name = "Mean scanline acquisition time" ;
                VIS006_acq_time:units = "milliseconds since 2017-07-01 12:45:22.975000" ;
                VIS006_acq_time:calendar = "proleptic_gregorian" ;
        float VIS006(y, x) ;
                VIS006:_FillValue = NaNf ;
                VIS006:calibration = "reflectance" ;
                VIS006:end_time = "2017-07-01 13:00:00" ;
                VIS006:georef_offset_corrected = "false" ;
                VIS006:grid_mapping = "msg_seviri_fes_3km" ;
                VIS006:modifiers = "" ;
                VIS006:nominal_end_time = "2017-07-01 13:00:00" ;
                VIS006:nominal_start_time = "2017-07-01 12:45:00" ;
                VIS006:orbital_parameters = "{\"projection_longitude\": 0.0, \"projection_latitude\": 0.0, \"projection_altitude\": 35785831.0, \"satellite_nominal_longitude\": 0.0, \"satellite_nominal_latitude\": 0.0, \"satellite_actual_longitude\": -0.14103548610869715, \"satellite_actual_latitude\": 0.7857524055230543, \"satellite_actual_altitude\": 35779085.35822027}" ;
                VIS006:platform_name = "Meteosat-10" ;
                VIS006:reader = "seviri_l1b_hrit" ;
                VIS006:resolution = 3000.403165817 ;
                VIS006:sensor = "seviri" ;
                VIS006:standard_name = "toa_bidirectional_reflectance" ;
                VIS006:start_time = "2017-07-01 12:45:00" ;
                VIS006:sun_earth_distance_correction_applied = "true" ;
                VIS006:sun_earth_distance_correction_factor = 1.01666136679324 ;
                VIS006:time_parameters = "{\"nominal_start_time\": \"2017-07-01 12:45:00\", \"nominal_end_time\": \"2017-07-01 13:00:00\", \"observation_start_time\": \"2017-07-01 12:45:10.354000\", \"observation_end_time\": \"2017-07-01 12:57:41.044000\"}" ;
                VIS006:units = "%" ;
                string VIS006:wavelength = "0.635 µm (0.56-0.71 µm)" ;
                VIS006:coordinates = "VIS006_acq_time longitude IR_108_acq_time latitude" ;

// global attributes:
                :history = "Created by pytroll/satpy on 2023-10-24 07:52:07.913676" ;
                :Conventions = "CF-1.7" ;
}
pnuu commented 1 year ago

The wavelength units look wrong. In the error message of the load it is unit='µm' and in the ncdump output string IR_108:wavelength = "10.8 µm (9.8-11.8 µm)" ;.

pnuu commented 1 year ago

Tried with some SEVIRI HRIT data I have:

In [9]: scn['IR_108'].attrs['wavelength']
Out[9]: WavelengthRange(min=9.8, central=10.8, max=11.8, unit='µm')

And after saving ncdump -h shows

string IR_108:wavelength = "10.8 µm (9.8-11.8 µm)" ;

Now, what would cause the wavelength unit go wrong in your case?

johanna-mayer commented 1 year ago

I think the "Â" character might be an issue of my terminal with displaying special characters and not necessarily a problem of how the file is saved. I also tried reading in the saved netCDF file using xarray and it works fine:

import xarray as xr
with xr.open_dataset(os.path.join('/tmp', filename)) as ds:
    ds.load()

here I can access the variables normally. E.g. if I print ds.IR_108 in a notebook everything looks fine: grafik

djhoese commented 1 year ago

Yes, that is exactly what I ran. Are format strings problematic for the writer?

Sorry, I saw the format parameters/fields in the string but missed that it was an f-string and those were being formatted immediately. I thought you were somehow giving a filename with {start_time:%Y...} to the Scene to load from and was very confused as that file shouldn't exist.

Anyway...

it looks like the satpy_cf_nc reader does not parse/read the JSON from the time_parameters key. Only orbital_parameters:

https://github.com/pytroll/satpy/blob/bc32c9434815365d180b1c6d38c00b2b6b4d5f7e/satpy/readers/satpy_cf_nc.py#L316-L317

I'm not sure who last worked on this.

johanna-mayer commented 1 year ago

Ah thank you, I think that was it! If I delete the time_parameters attribute before saving the scene as netCDF I can read the file in with the satpy_cf_nc reader.

djhoese commented 1 year ago

Looks like @pnuu added this orbital_parameters handling in #1666. Thoughts @pnuu?

pnuu commented 1 year ago

Seems that the comments by @sfinkens and me in #1666 are finally needed: parse everything starting with { to a dict.

gerritholl commented 11 months ago

MCVE based on FCI:

import hdf5plugin
from satpy import Scene
from glob import glob
from satpy.utils import debug_on; debug_on()
filenames = glob("/media/nas/x23352/MTG/FCI/L1c/2023/12/04/W_XX-EUMETSAT-Darmstadt,IMG+SAT,MTI1+FCI-1C-RRAD-FDHSI-FD--CHK-BODY--DIS-NC4E_C_EUMT_20231204*_IDPFI_OPE_20231204*_20231204*_N_JLS_C_0050_*.nc")
sc = Scene(filenames=filenames, reader="fci_l1c_nc")
sc.load(["vis_06"], calibration="counts")
sc.save_dataset(
        "vis_06",
        "/media/nas/x21308/scratch/FCI/{platform_name}-{sensor}-{start_time:%Y%m%d%H%M%S}-{end_time:%Y%m%d%H%M%S}.nc",
        encoding={"vis_06": {"dtype": "int16", "zlib": True}},
        include_lonlats=False)
sc2 = Scene(filenames=["/media/nas/x21308/scratch/FCI/MTG-I1-fci-20231204081000-20231204082000.nc"], reader=["satpy_cf_nc"])
sc2.load(["vis_06"])
djhoese commented 11 months ago

@gerritholl does this work with #2688?

sfinkens commented 11 months ago

Yes :slightly_smiling_face: (Gerrit shared the sample data with me)