Alexander-Barth / NCDatasets.jl

Load and create NetCDF files in Julia

Default missing values #163

Closed timhultberg closed 2 years ago

timhultberg commented 2 years ago

I have been given a NetCDF file with a variable stored as short. Scale factor and offset attributes are given, but neither a _FillValue nor a missing_value attribute is present. Some applications (like ncdump) treat values of -32767 (the default fill value of short) as missing, but NCDatasets does not: it applies the usual scale and offset transformation to get a value.

Sure, this is a matter of convention, but the fact that ncdump sees this as missing makes me think that this is the "official" convention. Besides, it would be very convenient for me if NCDatasets did the same :-)
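To make the difference concrete, here is a minimal sketch in plain Julia of the two behaviours described above. The helper `unpack` and its keyword `mask_default_fill` are hypothetical names used only for illustration and are not part of NCDatasets; the scale factor and offset are arbitrary example values. The only fact taken from netCDF itself is that -32767 is the default fill value for short.

```julia
# Default netCDF fill value for the `short` (Int16) type.
const DEFAULT_FILL_SHORT = Int16(-32767)

# Hypothetical helper (not part of NCDatasets): unpack raw Int16 values,
# optionally treating the default fill value as missing.
function unpack(raw::AbstractArray{Int16}, scale_factor, add_offset;
                mask_default_fill = false)
    if mask_default_fill
        # Fill values become `missing`; everything else is scaled.
        return [r == DEFAULT_FILL_SHORT ? missing : r * scale_factor + add_offset
                for r in raw]
    else
        # Every value is scaled, including -32767.
        return raw .* scale_factor .+ add_offset
    end
end

# Arbitrary example values (assumptions, not taken from any file).
raw = Int16[-32767, 0, 1000]
scale_factor = 0.01f0
add_offset = 50.0f0

plain  = unpack(raw, scale_factor, add_offset)
masked = unpack(raw, scale_factor, add_offset; mask_default_fill = true)
```

With `mask_default_fill = false`, the first element is an ordinary number that looks like valid data, which is the situation reported here.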

Alexander-Barth commented 2 years ago

Can you post the output of "ncdump -h" here? Maybe it contains the attribute valid_min, valid_max, or valid_range.

timhultberg commented 2 years ago

Output of ncdump -h below. Take latitude as an example: it does not seem to have valid_min, valid_max, or valid_range.

netcdf W_XX-EUMETSAT-Darmstadt\,SND+SAT\,MTS1+IRS-1B-PC--Q4--CHK-BODY---NC4E_C_EUMT_20220114083524_IDPFS_DEV_20160315125620_20160315125630_N__T_0049_0069 {
  types:
    byte enum boolean {false = 0, true = 1} ;
    byte enum trilean {false = 0, true = 1, undefined = 2} ;
  dimensions:
    index = 1 ;
    scalar = 1 ;
  variables:
    ushort index(index) ;
        index:long_name = "Coordinate variable of indices drived from repeat cycle data vectors" ;
        index:_Unsigned = "true" ;
    double time(index) ;
        time:title = "UTC time for geometric data vectors" ;
        time:long_name = "UTC time at which the geometric metadata are calculated." ;
        time:standard_name = "time" ;
        time:units = "seconds since 2000-01-01 00:00:00.0" ;
        time:_FillValue = 9.96920996838687e+36 ;
    ushort index_offset ;
        index_offset:long_name = "Offset index for data vectors" ;
        index_offset:_Unsigned = "true" ;

// global attributes:
        :Conventions = "CF-1.7, Unidata Dataset Discovery v1.3" ;
        :title = "W_XX-EUMETSAT-Darmstadt,SND+SAT,MTS1+IRS-1B-PC--Q4--CHK-BODY---NC4E_C_EUMT_20220114083524_IDPFS_DEV_20160315125620_20160315125630_N__T_0049_0069" ;
        :institution = "EUMETSAT" ;
        :source = "satellite sounder" ;
        :summary = "Infra-red Sounder (IRS) Level 1b Principal Component dataset - body data chunk" ;
        :history = "original generated file" ;
        :comment = "" ;
        :product_id = "IRS-1B-PC-x-Qn" ;
        :location_indicator = "XX-EUMETSAT-DARMSTADT" ;
        :data_designator = "SND+SAT" ;
        :platform = "MTS1" ;
        :data_source = "IRS" ;
        :processing_level = "1B" ;
        :type = "PC" ;
        :subtype = "" ;
        :coverage = "Q4" ;
        :subsetting = "" ;
        :component1 = "CHK" ;
        :component2 = "BODY" ;
        :component3 = "" ;
        :purpose = "" ;
        :format = "NC4E" ;
        :date_created = "20220114083524" ;
        :facility_or_tool = "IDPFS" ;
        :environment = "DEV" ;
        :time_coverage_start = "20160315125620" ;
        :time_coverage_end = "20160315125630" ;
        :processing_mode = "N" ;
        :special_compression = "" ;
        :disposition_mode = "T" ;
        :repeat_cycle_in_day = "0049" ;
        :count_in_repeat_cycle = "0069" ;
        :keywords_vocabulary = "" ;
        :keywords = "MTG IRS Principal Components" ;
        :id = "" ;
        :naming_authority = "" ;
        :creator_type = "institution" ;
        :creator_institution = "EUMETSAT" ;
        :creator_name = "" ;
        :creator_email = "" ;
        :creator_url = "" ;
        :project = "MTG" ;
        :cdm_datatype = "Grid" ;
        :references = "www.eumetsat.int" ;
        :license = "None" ;
        :standard_name_vocabulary = "" ;
        :time_coverage_duration = "10s" ;
        :time_coverage_resolution = "N.A." ;
        :geospatial_lat_min = 57.7766265869141 ;
        :geospatial_lat_max = 80.5074462890625 ;
        :geospatial_lon_min = -44.2742347717285 ;
        :geospatial_lon_max = -5.70273923873901 ;
        :release_version = "Original" ;
        :baseline_version = "0.0.0" ;
        :processor_version = "0.1" ;
        :algorithm_version = "0.0.0" ;
        :format_version = "IRSL1FS 5F" ;
        :instrument_configuration_id = "0.0.0" ;
        :instrument_configuration_id_version = "0.0.0" ;
        :group_tag = "" ;
        :runtime_data = "" ;
        :parent_data = "" ;
        :linked_data = "" ;
        :mtg_name = "W_XX-EUMETSAT-Darmstadt,SND+SAT,MTS1+IRS-1B-PC--Q4--CHK-BODY---NC4E_C_EUMT_20220114083524_IDPFS_DEV_20160315125620_20160315125630_N__T_0049_0069.nc" ;
        :alternative_name = "" ;
        :date_time_position = "20160315120000" ;
        :time_position = "120000" ;
        :processed_count_in_repeat_cycle = "N.A." ;
        :subsettable_groups = "" ;
        :subsettable_groups_present = "" ;

group: data {
  types:
    byte enum stroke_direction_t {Forward = 0, Backward = 1} ;
    byte enum dwell_type_t {EV = 0, BB = 1, DS1 = 2, DS2 = 3, SA = 4} ;
  dimensions:
    dwell_row = 160 ;
    dwell_column = 160 ;
    detector_row = 480 ;
    detector_column = 480 ;
  variables:
    double time ;
        time:long_name = "Central time of dwell observation" ;
        time:units = "seconds since 01:01:2000 00:00:00" ;
    dwell_type_t dwell_type(scalar) ;
        dwell_type:long_name = "Dwell type" ;
    ushort dwell_number ;
        dwell_number:long_name = "Dwell number defined by scan-law" ;
        dwell_number:_Unsigned = "true" ;
    stroke_direction_t stroke_direction(scalar) ;
        stroke_direction:long_name = "Stroke direction for dwell acquisition" ;
    short latitude(dwell_row, dwell_column) ;
        latitude:long_name = "Geolocation latitude" ;
        latitude:units = "degrees_north" ;
        latitude:add_offset = 57.77663f ;
        latitude:scale_factor = 0.001420676f ;
    short longitude(dwell_row, dwell_column) ;
        longitude:long_name = "Geolocation longitude " ;
        longitude:units = "degrees_east" ;
        longitude:add_offset = -44.27423f ;
        longitude:scale_factor = 0.002410718f ;
    short satellite_azimuth(dwell_row, dwell_column) ;
        satellite_azimuth:long_name = "Satellite azimuth angle" ;
        satellite_azimuth:units = "degrees" ;
        satellite_azimuth:coordinates = "latitude longitude" ;
        satellite_azimuth:add_offset = 0.f ;
        satellite_azimuth:scale_factor = 0.025f ;
    short satellite_zenith(dwell_row, dwell_column) ;
        satellite_zenith:long_name = "Satellite zenith angle" ;
        satellite_zenith:units = "degrees" ;
        satellite_zenith:coordinates = "latitude longitude" ;
        satellite_zenith:add_offset = 0.f ;
        satellite_zenith:scale_factor = 0.01f ;
    short solar_azimuth(dwell_row, dwell_column) ;
        solar_azimuth:long_name = "Solar azimuth angle" ;
        solar_azimuth:units = "degrees" ;
        solar_azimuth:coordinates = "latitude longitude" ;
        solar_azimuth:add_offset = 0.f ;
        solar_azimuth:scale_factor = 0.025f ;
    short solar_zenith(dwell_row, dwell_column) ;
        solar_zenith:long_name = "Solar zenith angle" ;
        solar_zenith:units = "degrees" ;
        solar_zenith:coordinates = "latitude longitude" ;
        solar_zenith:add_offset = 0.f ;
        solar_zenith:scale_factor = 0.01f ;

group: quality {
  dimensions:
    met_laser = 3 ;
  variables:
    boolean spectral_fringe_count(met_laser) ;
        spectral_fringe_count:long_name = "Fringe counting error flag" ;
    boolean geometric_restricted_zone_earth_applicable(scalar) ;
        geometric_restricted_zone_earth_applicable:long_name = "geometric_restricted_zone_earth_applicable" ;
    boolean restricted_operations_sun_eclipse_by_moon(scalar) ;
        restricted_operations_sun_eclipse_by_moon:long_name = "Restricted operations due to a Sun eclipse by Moon from Satellite during the dataset " ;
  } // group quality

group: mwir {
  dimensions:
    wavenumber = 1079 ;
  variables:
    float spatial_sampling_distance ;
        spatial_sampling_distance:long_name = "Spatial sampling distance of band" ;
        spatial_sampling_distance:units = "kilometres" ;
    float start_wavenumber ;
        start_wavenumber:long_name = "Wavenumber of the start of the band" ;
        start_wavenumber:units = "m-1" ;
    float end_wavenumber ;
        end_wavenumber:long_name = "Wavenumber of the end of the band" ;
        end_wavenumber:units = "m-1" ;
    float wavenumber_step ;
        wavenumber_step:long_name = "Wavenumber step size between spectral components" ;
        wavenumber_step:units = "m-1" ;
    float wavenumber(wavenumber) ;
        wavenumber:long_name = "Wavenumbers for band spectra" ;
    int start_number ;
        start_number:long_name = "First spectral channel for the band (2653 for MW and 1127 for LW)" ;
    int end_number ;
        end_number:long_name = "Last spectral channel for the band (3608 for MW and 2005 for LW)" ;
    float maximum_opd ;
        maximum_opd:long_name = "Maximum Optical Path Difference for the band" ;

group: compressed {
  dimensions:
    global_pc = 150 ;
    local_pc = 5 ;
  variables:
    string global_pc_file ;
    int global_pc_scores(dwell_row, dwell_column, global_pc) ;
        global_pc_scores:long_name = "Global Principal Component Scores" ;
        global_pc_scores:coordinates = "latitude longitude" ;
        global_pc_scores:add_offset = 0.f ;
        global_pc_scores:scale_factor = 0.5f ;
    boolean global_pc_comp_err(scalar) ;
    float global_pcr_scores(dwell_row, dwell_column) ;
        global_pcr_scores:long_name = "Global Principal Component reconstruction scores" ;
        global_pcr_scores:coordinates = "latitude longitude" ;
    boolean global_pcrs_quality(dwell_row, dwell_column) ;
        global_pcrs_quality:long_name = "Quality metric for global principal component reconstruction scores" ;
        global_pcrs_quality:coordinates = "latitude longitude" ;
    int local_pc_scores(dwell_row, dwell_column, local_pc) ;
        local_pc_scores:long_name = "Local principal component scores" ;
        local_pc_scores:coordinates = "latitude longitude" ;
        local_pc_scores:add_offset = 0.f ;
        local_pc_scores:scale_factor = 1.f ;
    boolean local_pc_comp_err(scalar) ;
    float local_eigenvalues(local_pc) ;
        local_eigenvalues:long_name = "Local principal component Eigenvalues" ;
    float local_pcr_scores(dwell_row, dwell_column) ;
        local_pcr_scores:long_name = "Local principal component reconstruction scores" ;
        local_pcr_scores:coordinates = "latitude longitude" ;
    float residual_energy ;
        residual_energy:long_name = "Quality metric derived from covariance and mean of noise normalised radiance residuals" ;
    float local_pcr_operator(local_pc, wavenumber) ;
        local_pcr_operator:long_name = "Local principal component reconstruction operator" ;
    short dc_image_radiance(detector_row, detector_column) ;
        dc_image_radiance:long_name = "Calibrated DC image radiances for the dwell bands" ;
        dc_image_radiance:add_offset = 0.f ;
        dc_image_radiance:scale_factor = 1.f ;
    short spatial_sample_quality(dwell_row, dwell_column) ;
        spatial_sample_quality:valid_range = 0s, 8191s ;
        spatial_sample_quality:flag_masks = 1s, 2s, 4s, 8s, 16s, 32s, 64s, 128s, 256s, 512s, 1024s, 2048s, 4096s ;
        spatial_sample_quality:flag_meanings = "space_view,limb_view,cloudy,dust,land,saturated_detector_sample_warning,undersaturated_detector_sample_warning,noisy_spatial_detector_sample_warning,solar_straylight_warning,solar_straylight_correction_warning,geolocation_warning,nan,nan" ;
        spatial_sample_quality:long_name = "Spatial sample quality flags" ;
        spatial_sample_quality:coordinates = "latitude longitude" ;
    byte detector_sample_quality(dwell_row, dwell_column) ;
        detector_sample_quality:valid_range = 0b, 15b ;
        detector_sample_quality:flag_masks = 1b, 2b, 4b, 8b ;
        detector_sample_quality:flag_meanings = "saturated_detector_sample,undersaturated_detector_sample,noisy_detector_sample,excluded_detector_sample" ;
        detector_sample_quality:long_name = "Detector sample quality flags" ;
        detector_sample_quality:coordinates = "latitude longitude" ;
  } // group compressed

group: quality_band {
  variables:
    boolean rad_rts_noise(dwell_row, dwell_column) ;
        rad_rts_noise:long_name = "RTS noise flag" ;
        rad_rts_noise:coordinates = "latitude longitude" ;
    boolean rad_non_linearity(scalar) ;
        rad_non_linearity:long_name = "Non Linearity correction flag" ;
    boolean spectral_stability(scalar) ;
        spectral_stability:long_name = "Spectral stability flag" ;
    boolean phase_warning(scalar) ;
        phase_warning:long_name = "Phase warning flag" ;
    boolean spectral_scale_uncertainty_warning(scalar) ;
        spectral_scale_uncertainty_warning:long_name = "Spectral scale predictor uncertainty warning flag" ;
    int number_of_expected_earthview_spatial_samples ;
        number_of_expected_earthview_spatial_samples:long_name = "Number of expected Earth spatial samples" ;
    int number_of_earthview_spatial_samples ;
        number_of_earthview_spatial_samples:long_name = "Number of Earth spatial samples" ;
    int number_of_spaceview_spatial_samples ;
        number_of_spaceview_spatial_samples:long_name = "Number of masked spatial samples" ;
    int number_of_missing_spatial_samples ;
        number_of_missing_spatial_samples:long_name = "Number of missing spatial samples" ;
    int number_of_saturation_warning_spatial_samples ;
        number_of_saturation_warning_spatial_samples:long_name = "Number of Earth spatial samples with saturated_detector_sample_warning flag set" ;
    int number_of_excluded_detector_sample_warning_spatial_samples ;
        number_of_excluded_detector_sample_warning_spatial_samples:long_name = "Number of Earth spatial samples with excluded_detector_sample_warning set" ;
    int number_of_missing_detector_sample_warning_spatial_samples ;
        number_of_missing_detector_sample_warning_spatial_samples:long_name = "Number of Earth spatial samples with missing_spatial_detector_sample_warning flag set" ;
    int number_noisy_detector_sample_warning_spatial_samples ;
        number_noisy_detector_sample_warning_spatial_samples:long_name = "Number of Earth spatial samples with noisy_detector_sample_warning flag set" ;
    int number_of_straylight_warning_spatial_samples ;
        number_of_straylight_warning_spatial_samples:long_name = "Number of Earth spatial samples with solar_straylight_warning flag set" ;
    int number_of_straylight_correction_warning_spatial_samples ;
        number_of_straylight_correction_warning_spatial_samples:long_name = "Number of Earth spatial samples with straylight_correction_warning flag set" ;
    int number_of_cloudy_spatial_samples ;
        number_of_cloudy_spatial_samples:long_name = "Number of Earth spatial samples with cloudy flag set" ;
  } // group quality_band
} // group mwir

group: lwir {
  dimensions:
    wavenumber = 881 ;
  variables:
    float spatial_sampling_distance ;
        spatial_sampling_distance:long_name = "Spatial sampling distance of band" ;
        spatial_sampling_distance:units = "kilometres" ;
    float start_wavenumber ;
        start_wavenumber:long_name = "Wavenumber of the start of the band" ;
        start_wavenumber:units = "m-1" ;
    float end_wavenumber ;
        end_wavenumber:long_name = "Wavenumber of the end of the band" ;
        end_wavenumber:units = "m-1" ;
    float wavenumber_step ;
        wavenumber_step:long_name = "Wavenumber step size between spectral components" ;
        wavenumber_step:units = "m-1" ;
    float wavenumber(wavenumber) ;
        wavenumber:long_name = "Wavenumbers for band spectra" ;
    int start_number ;
        start_number:long_name = "First spectral channel for the band (2653 for MW and 1127 for LW)" ;
    int end_number ;
        end_number:long_name = "Last spectral channel for the band (3608 for MW and 2005 for LW)" ;
    float maximum_opd ;
        maximum_opd:long_name = "Maximum Optical Path Difference for the band" ;

group: compressed {
  dimensions:
    global_pc = 150 ;
    local_pc = 5 ;
  variables:
    string global_pc_file ;
    int global_pc_scores(dwell_row, dwell_column, global_pc) ;
        global_pc_scores:long_name = "Global Principal Component Scores" ;
        global_pc_scores:coordinates = "latitude longitude" ;
        global_pc_scores:add_offset = 0.f ;
        global_pc_scores:scale_factor = 0.5f ;
    boolean global_pc_comp_err(scalar) ;
    float global_pcr_scores(dwell_row, dwell_column) ;
        global_pcr_scores:long_name = "Global Principal Component reconstruction scores" ;
        global_pcr_scores:coordinates = "latitude longitude" ;
    boolean global_pcrs_quality(dwell_row, dwell_column) ;
        global_pcrs_quality:long_name = "Quality metric for global principal component reconstruction scores" ;
        global_pcrs_quality:coordinates = "latitude longitude" ;
    int local_pc_scores(dwell_row, dwell_column, local_pc) ;
        local_pc_scores:long_name = "Local principal component scores" ;
        local_pc_scores:coordinates = "latitude longitude" ;
        local_pc_scores:add_offset = 0.f ;
        local_pc_scores:scale_factor = 1.f ;
    boolean local_pc_comp_err(scalar) ;
    float local_eigenvalues(local_pc) ;
        local_eigenvalues:long_name = "Local principal component Eigenvalues" ;
    float local_pcr_scores(dwell_row, dwell_column) ;
        local_pcr_scores:long_name = "Local principal component reconstruction scores" ;
        local_pcr_scores:coordinates = "latitude longitude" ;
    float residual_energy ;
        residual_energy:long_name = "Quality metric derived from covariance and mean of noise normalised radiance residuals" ;
    float local_pcr_operator(local_pc, wavenumber) ;
        local_pcr_operator:long_name = "Local principal component reconstruction operator" ;
    short dc_image_radiance(detector_row, detector_column) ;
        dc_image_radiance:long_name = "Calibrated DC image radiances for the dwell bands" ;
        dc_image_radiance:add_offset = 0.f ;
        dc_image_radiance:scale_factor = 1.f ;
    short spatial_sample_quality(dwell_row, dwell_column) ;
        spatial_sample_quality:valid_range = 0s, 8191s ;
        spatial_sample_quality:flag_masks = 1s, 2s, 4s, 8s, 16s, 32s, 64s, 128s, 256s, 512s, 1024s, 2048s, 4096s ;
        spatial_sample_quality:flag_meanings = "space_view,limb_view,cloudy,dust,saturated_detector_sample_warning,undersaturated_detector_sample_warning,noisy_detector_sample_warning,solar_straylight_warning,solar_straylight_correction_warning,nan,nan,nan,nan" ;
        spatial_sample_quality:long_name = "Spatial sample quality flags" ;
        spatial_sample_quality:coordinates = "latitude longitude" ;
    byte detector_sample_quality(dwell_row, dwell_column) ;
        detector_sample_quality:valid_range = 0b, 15b ;
        detector_sample_quality:flag_masks = 1b, 2b, 4b, 8b ;
        detector_sample_quality:flag_meanings = "saturated_detector_sample,undersaturated_detector_sample,noisy_detector_sample,excluded_detector_sample" ;
        detector_sample_quality:long_name = "Detector sample quality flags" ;
        detector_sample_quality:coordinates = "latitude longitude" ;
  } // group compressed

group: quality_band {
  variables:
    boolean rad_rts_noise(dwell_row, dwell_column) ;
        rad_rts_noise:long_name = "RTS noise flag" ;
        rad_rts_noise:coordinates = "latitude longitude" ;
    boolean rad_non_linearity(scalar) ;
        rad_non_linearity:long_name = "Non Linearity correction flag" ;
    boolean spectral_stability(scalar) ;
        spectral_stability:long_name = "Spectral stability flag" ;
    boolean phase_warning(scalar) ;
        phase_warning:long_name = "Phase warning flag" ;
    boolean spectral_scale_uncertainty_warning(scalar) ;
        spectral_scale_uncertainty_warning:long_name = "Spectral scale predictor uncertainty warning flag" ;
    int number_of_expected_earthview_spatial_samples ;
        number_of_expected_earthview_spatial_samples:long_name = "Number of expected Earth spatial samples" ;
    int number_of_earthview_spatial_samples ;
        number_of_earthview_spatial_samples:long_name = "Number of Earth spatial samples" ;
    int number_of_spaceview_spatial_samples ;
        number_of_spaceview_spatial_samples:long_name = "Number of masked spatial samples" ;
    int number_of_missing_spatial_samples ;
        number_of_missing_spatial_samples:long_name = "Number of missing spatial samples" ;
    int number_of_saturation_warning_spatial_samples ;
        number_of_saturation_warning_spatial_samples:long_name = "Number of Earth spatial samples with saturated_detector_sample_warning flag set" ;
    int number_of_excluded_detector_sample_warning_spatial_samples ;
        number_of_excluded_detector_sample_warning_spatial_samples:long_name = "Number of Earth spatial samples with excluded_detector_sample_warning set" ;
    int number_of_missing_detector_sample_warning_spatial_samples ;
        number_of_missing_detector_sample_warning_spatial_samples:long_name = "Number of Earth spatial samples with missing_spatial_detector_sample_warning flag set" ;
    int number_noisy_detector_sample_warning_spatial_samples ;
        number_noisy_detector_sample_warning_spatial_samples:long_name = "Number of Earth spatial samples with noisy_detector_sample_warning flag set" ;
    int number_of_straylight_warning_spatial_samples ;
        number_of_straylight_warning_spatial_samples:long_name = "Number of Earth spatial samples with solar_straylight_warning flag set" ;
    int number_of_straylight_correction_warning_spatial_samples ;
        number_of_straylight_correction_warning_spatial_samples:long_name = "Number of Earth spatial samples with straylight_correction_warning flag set" ;
    int number_of_cloudy_spatial_samples ;
        number_of_cloudy_spatial_samples:long_name = "Number of Earth spatial samples with cloudy flag set" ;
  } // group quality_band
} // group lwir

} // group data

group: state {

group: processor {
  types:
    byte enum auxiliary_dataset_status_type {OK = 0, out_of_validity_time = 1, not_available = 2} ;
  dimensions:
    auxiliary_dataset = 1 ;
  variables:
    string auxiliary_dataset_identifier(auxiliary_dataset) ;
    auxiliary_dataset_status_type auxiliary_dataset_status(auxiliary_dataset) ;
    string principal_components_version ;
    double last_spectral_calibration_time ;
        last_spectral_calibration_time:long_name = "Time of last spectral calibration" ;
        last_spectral_calibration_time:standard_name = "time" ;
        last_spectral_calibration_time:units = "seconds since 01:01:2000 00:00:00" ;
        last_spectral_calibration_time:precision = "1 millisecond" ;
  } // group processor

group: platform {
  types:
    byte enum manoeuvre_type {None = 0, NSSK = 1, EWSK = 2, SR = 3, MU = 4} ;
    byte enum reference_frame_type {undefined = 0, GCRF = 1, EME2000 = 2, ITRF2008 = 3, TDR = 4, TEME = 5, TOD = 6, RTN = 7} ;
    byte enum yaw_flip_type {winter = 0, summer = 1} ;
  variables:
    yaw_flip_type yaw_flip(scalar) ;
    boolean in_manoeuvre(scalar) ;
        in_manoeuvre:title = "Platform manoeuvre occurs in this dataset when set" ;
    double recent_manoeuvre_time_window ;
        recent_manoeuvre_time_window:_FillValue = 9.96920996838687e+36 ;
        recent_manoeuvre_time_window:title = "Time window to search for a manoeuvre that starts before or during this dataset" ;
        recent_manoeuvre_time_window:long_name = "Recent manoeuvre time window" ;
        recent_manoeuvre_time_window:units = "seconds" ;
    boolean recent_maneouvre_found(scalar) ;
        recent_maneouvre_found:long_name = "Recent or current manoeuvre found" ;
        recent_maneouvre_found:title = "Recent or current manoeuvre found in the recent manoeuvre time window" ;
    manoeuvre_type recent_manoeuvre_type(scalar) ;
        recent_manoeuvre_type:long_name = "Type of recent manoeuvre" ;
    double recent_manoeuvre_start_time ;
        recent_manoeuvre_start_time:units = "seconds since 2000-01-01 00:00:00.0" ;
        recent_manoeuvre_start_time:precision = "1 millisecond" ;
        recent_manoeuvre_start_time:standard_name = "time" ;
        recent_manoeuvre_start_time:long_name = "Start time in UTC of recent manoeuvre" ;
        recent_manoeuvre_start_time:_FillValue = 9.96920996838687e+36 ;
    double recent_manoeuvre_end_time ;
        recent_manoeuvre_end_time:long_name = "End time in UTC of recent manoeuvre" ;
        recent_manoeuvre_end_time:standard_name = "time" ;
        recent_manoeuvre_end_time:units = "seconds since 2000-01-01 00:00:00.0" ;
        recent_manoeuvre_end_time:precision = "1 millisecond" ;
        recent_manoeuvre_end_time:_FillValue = 9.96920996838687e+36 ;
    reference_frame_type recent_manoeuvre_reference_frame(scalar) ;
        recent_manoeuvre_reference_frame:long_name = "Reference frame for manoeuvre paramaters" ;
    double recent_manoeuvre_delta_vx ;
        recent_manoeuvre_delta_vx:long_name = "X component delta v for recent manoeuvre" ;
        recent_manoeuvre_delta_vx:units = "m/s" ;
        recent_manoeuvre_delta_vx:_FillValue = 9.96920996838687e+36 ;
    double recent_manoeuvre_delta_vy ;
        recent_manoeuvre_delta_vy:long_name = "Y component delta v for recent manoeuvre" ;
        recent_manoeuvre_delta_vy:units = "m/s" ;
        recent_manoeuvre_delta_vy:_FillValue = 9.96920996838687e+36 ;
    double recent_manoeuvre_delta_vz ;
        recent_manoeuvre_delta_vz:long_name = "Z component delta v for recent manoeuvre" ;
        recent_manoeuvre_delta_vz:units = "m/s" ;
        recent_manoeuvre_delta_vz:_FillValue = 9.96920996838687e+36 ;
    double recent_manoeuvre_spacecraft_delta_mass ;
        recent_manoeuvre_spacecraft_delta_mass:long_name = "Delta spacecraft mass for recent manoeuvre" ;
        recent_manoeuvre_spacecraft_delta_mass:units = "g" ;
        recent_manoeuvre_spacecraft_delta_mass:_FillValue = 9.96920996838687e+36 ;
    double upcoming_manoeuvre_time_window ;
        upcoming_manoeuvre_time_window:title = "Time window to search for a manoeuvre that starts after this dataset" ;
        upcoming_manoeuvre_time_window:long_name = "Upcoming manoeuvre time window" ;
        upcoming_manoeuvre_time_window:units = "seconds" ;
        upcoming_manoeuvre_time_window:_FillValue = 9.96920996838687e+36 ;
    boolean upcoming_maneouvre_found(scalar) ;
        upcoming_maneouvre_found:long_name = "Upcoming manoeuvre found" ;
        upcoming_maneouvre_found:title = "Upcoming manoeuvre found in the upcoming manoeuvre time window" ;
    manoeuvre_type upcoming_manoeuvre_type(scalar) ;
        upcoming_manoeuvre_type:long_name = "Type of upcoming manoeuvre" ;
    double upcoming_manoeuvre_start_time ;
        upcoming_manoeuvre_start_time:long_name = "Start time in UTC of upcoming manoeuvre" ;
        upcoming_manoeuvre_start_time:standard_name = "time" ;
        upcoming_manoeuvre_start_time:units = "seconds since 2000-01-01 00:00:00.0" ;
        upcoming_manoeuvre_start_time:precision = "1 millisecond" ;
        upcoming_manoeuvre_start_time:_FillValue = 9.96920996838687e+36 ;
    double upcoming_manoeuvre_end_time ;
        upcoming_manoeuvre_end_time:long_name = "End time in UTC of upcoming manoeuvre" ;
        upcoming_manoeuvre_end_time:standard_name = "time" ;
        upcoming_manoeuvre_end_time:units = "seconds since 2000-01-01 00:00:00.0" ;
        upcoming_manoeuvre_end_time:precision = "1 millisecond" ;
        upcoming_manoeuvre_end_time:_FillValue = 9.96920996838687e+36 ;
    reference_frame_type upcoming_manoeuvre_reference_frame(scalar) ;
        upcoming_manoeuvre_reference_frame:long_name = "Reference frame for manoeuvre paramaters" ;
    double upcoming_manoeuvre_delta_vx ;
        upcoming_manoeuvre_delta_vx:long_name = "X component delta v for upcoming manoeuvre" ;
        upcoming_manoeuvre_delta_vx:units = "m/s" ;
        upcoming_manoeuvre_delta_vx:_FillValue = 9.96920996838687e+36 ;
    double upcoming_manoeuvre_delta_vy ;
        upcoming_manoeuvre_delta_vy:long_name = "Y component delta v for upcoming manoeuvre" ;
        upcoming_manoeuvre_delta_vy:units = "m/s" ;
        upcoming_manoeuvre_delta_vy:_FillValue = 9.96920996838687e+36 ;
    double upcoming_manoeuvre_delta_vz ;
        upcoming_manoeuvre_delta_vz:long_name = "Z component delta v for upcoming manoeuvre" ;
        upcoming_manoeuvre_delta_vz:units = "m/s" ;
        upcoming_manoeuvre_delta_vz:_FillValue = 9.96920996838687e+36 ;
    double upcoming_manoeuvre_spacecraft_delta_mass ;
        upcoming_manoeuvre_spacecraft_delta_mass:long_name = "Delta spacecraft mass for upcoming manoeuvre" ;
        upcoming_manoeuvre_spacecraft_delta_mass:units = "g" ;
        upcoming_manoeuvre_spacecraft_delta_mass:_FillValue = 9.96920996838687e+36 ;
    float subsatellite_latitude(index) ;
        subsatellite_latitude:long_name = "Sub-satellite latitude" ;
        subsatellite_latitude:units = "degrees_north" ;
        subsatellite_latitude:standard_name = "latitude" ;
        subsatellite_latitude:_FillValue = 9.96921e+36f ;
    float subsatellite_longitude(index) ;
        subsatellite_longitude:long_name = "Sub-satellite longitude" ;
        subsatellite_longitude:units = "degrees_east" ;
        subsatellite_longitude:standard_name = "longitude" ;
        subsatellite_longitude:_FillValue = 9.96921e+36f ;
    float platform_altitude(index) ;
        platform_altitude:long_name = "Platform altitude" ;
        platform_altitude:units = "m" ;
        platform_altitude:_FillValue = 9.96921e+36f ;
    float orbit_phase(index) ;
        orbit_phase:long_name = "Orbit phase" ;
        orbit_phase:units = "degrees" ;
        orbit_phase:_FillValue = 9.96921e+36f ;
  } // group platform

group: celestial {
  variables:
    float solar_azimuth(index) ;
        solar_azimuth:standard_name = "solar_azimuth_angle" ;
        solar_azimuth:long_name = "Solar azimuth ange" ;
        solar_azimuth:units = "degree" ;
        solar_azimuth:_FillValue = 9.96921e+36f ;
    float subsolar_longitude(index) ;
        subsolar_longitude:long_name = "Sub-solar longitude" ;
        subsolar_longitude:units = "degrees_east" ;
        subsolar_longitude:standard_name = "longitude" ;
        subsolar_longitude:valid_range = -90.f, 90.f ;
        subsolar_longitude:_FillValue = 9.96921e+36f ;
    float subsolar_latitude(index) ;
        subsolar_latitude:long_name = "Sub-solar latitude" ;
        subsolar_latitude:units = "degrees_north" ;
        subsolar_latitude:standard_name = "latitude" ;
        subsolar_latitude:_FillValue = 9.96921e+36f ;
    float solar_elevation(index) ;
        solar_elevation:standard_name = "solar_elevation_angle" ;
        solar_elevation:long_name = "Solar elevation angle" ;
        solar_elevation:units = "degree" ;
        solar_elevation:_FillValue = 9.96921e+36f ;
    float earth_sun_distance(index) ;
        earth_sun_distance:long_name = "Distance between Earth and Sun" ;
        earth_sun_distance:units = "km" ;
        earth_sun_distance:_FillValue = 9.96921e+36f ;
    float sun_satellite_distance(index) ;
        sun_satellite_distance:long_name = "Distance between satellite and Sun" ;
        sun_satellite_distance:units = "km" ;
        sun_satellite_distance:_FillValue = 9.96921e+36f ;
    boolean sun_eclipse_by_earth(index) ;
        sun_eclipse_by_earth:long_name = "Sun eclipsed by Earth" ;
        sun_eclipse_by_earth:titile = "If TRUE indicates an eclipse of the Sun by the Earth, as viewed by the satellite" ;
    boolean sun_eclipse_by_moon(index) ;
        sun_eclipse_by_moon:long_name = "Sun eclipsed by Moon" ;
        sun_eclipse_by_moon:titile = "If TRUE indicates an eclipse of the Sun by the Moon as viewed by the satellite" ;
  } // group celestial

group: instrument {
  variables:
    double repeat_cycle_start_time ;
        repeat_cycle_start_time:long_name = "Repeat cycle start time" ;
        repeat_cycle_start_time:units = "seconds since 2000-01-01 00:00:00.0" ;
    int repeat_sequence_counter ;
        repeat_sequence_counter:long_name = "Repeat sequence counter”" ;
    ushort repeat_sequence_id ;
        repeat_sequence_id:long_name = "Repeat sequence ID”" ;
        repeat_sequence_id:_Unsigned = "true" ;
    ushort repeat_cycles_in_repeat_sequence ;
        repeat_cycles_in_repeat_sequence:long_name = "Number of repeat cycles in current repeat sequence”" ;
        repeat_cycles_in_repeat_sequence:_Unsigned = "true" ;
    ushort repeat_cycle_counter_in_repeat_sequence ;
        repeat_cycle_counter_in_repeat_sequence:long_name = "Repeat Cycle position in current repeat sequence”" ;
        repeat_cycle_counter_in_repeat_sequence:_Unsigned = "true" ;
    int repeat_cycle_id_in_repeat_sequence ;
        repeat_cycle_id_in_repeat_sequence:long_name = "Repeat cycle ID in current repeat sequence”" ;
    byte repeat_cycle_lac_id ;
        repeat_cycle_lac_id:long_name = "Repeat cycle type" ;
    boolean mwir_band_on(scalar) ;
        mwir_band_on:long_name = "Status of the MWIR detection chain" ;
    boolean lwir_band_on(scalar) ;
        lwir_band_on:long_name = "Status of the LWIR detection chain" ;
    double last_decontamination_start_time ;
        last_decontamination_start_time:units = "seconds since 2000-01-01 00:00:00.0" ;
        last_decontamination_start_time:_FillValue = 9.96920996838687e+36 ;
        last_decontamination_start_time:long_name = "Start time of last decontamination" ;
    double last_decontamination_end_time ;
        last_decontamination_end_time:_FillValue = 9.96920996838687e+36 ;
        last_decontamination_end_time:long_name = "End time of last decontamination" ;
        last_decontamination_end_time:units = "seconds since 2000-01-01 00:00:00.0" ;
    double last_detection_chain_parameter_change_time ;
        last_detection_chain_parameter_change_time:_FillValue = 9.96920996838687e+36 ;
        last_detection_chain_parameter_change_time:long_name = "Time of the last change in detection chain parameters" ;
  } // group instrument
} // group state
}

Alexander-Barth commented 2 years ago

I can confirm the behavior of ncdump:

ncgen -o test.nc << EOF
netcdf test {
dimensions:
    dim = 3 ;
variables:
    short var(dim) ;
data:
    var = -32767,  -32767,  -32767 ;
}
EOF

output of ncdump test.nc:

netcdf test {
dimensions:
    dim = 3 ;
variables:
    short var(dim) ;
data:

 var = _, _, _ ;
}

But python-xarray (version 0.21.1) does not mask these values:

import xarray as xr
xr.open_dataset("test.nc")["var"].to_numpy()
# output:  array([-32767, -32767, -32767], dtype=int16)

I am not sure what is correct here.

(For readability, can you limit the output of ncdump -h to the variable in question? Thanks!)

timhultberg commented 2 years ago

I am not sure what is correct either, but the data provider certainly meant these values to be masked, and I sympathize with the idea of being able to use a default missing value.

Extract of ncdump -h:

    short latitude(dwell_row, dwell_column) ;
        latitude:long_name = "Geolocation latitude" ;
        latitude:units = "degrees_north" ;
        latitude:add_offset = 57.77663f ;
        latitude:scale_factor = 0.001420676f ;

Alexander-Barth commented 2 years ago

If we implemented this, then every access to a NetCDF variable (except for coordinate variables) would have to return an array of Union{Missing,Float64} instead of Float64 (for data in double precision) if we want type-stable code. Only once you have read the data do you know whether it contains a value equal to the default fill value, but that is too late for the Julia compiler. This would increase memory usage and would likely reduce speed as well.
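The type-stability concern can be illustrated with a small sketch in plain Julia. Both functions below are hypothetical and not NCDatasets code: the first picks its return type from the data it sees, which the compiler cannot predict, while the second always returns an array that admits `missing`, even when nothing is missing.

```julia
# Hypothetical loader whose return type depends on the values in `raw`:
# Vector{Float64} when no fill value occurs, an array admitting `missing` otherwise.
function load_unstable(raw::Vector{Int16}, fill::Int16)
    if any(==(fill), raw)
        return [r == fill ? missing : Float64(r) for r in raw]
    else
        return Float64.(raw)
    end
end

# Type-stable alternative: the element type is always Union{Missing,Float64},
# which is the cost described above.
function load_stable(raw::Vector{Int16}, fill::Int16)
    out = Vector{Union{Missing,Float64}}(undef, length(raw))
    for (i, r) in pairs(raw)
        out[i] = r == fill ? missing : Float64(r)
    end
    return out
end

fill_short = Int16(-32767)          # netCDF default fill value for short
clean      = Int16[1, 2, 3]         # contains no fill value
with_fill  = Int16[1, -32767, 3]    # contains one fill value

typeof(load_unstable(clean, fill_short))      # differs from the next line
typeof(load_unstable(with_fill, fill_short))
typeof(load_stable(clean, fill_short))        # same type for both inputs
typeof(load_stable(with_fill, fill_short))
```

Running `@code_warntype load_unstable(clean, fill_short)` in the REPL is one way to see the inferred `Union` return type of the first variant.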

timhultberg commented 2 years ago

OK, I see. But maybe there could be a way to consciously opt in to the use of default missing values, with the associated overhead? This convention is foreseen for operational products from EUMETSAT, and without it I get a perfectly sensible-looking geolocation for the missing pixels.

Alexander-Barth commented 2 years ago

Can you give me more background about the nature of this EUMETSAT product? I know some people there who use xarray, and I am not sure whether they are aware of these complications. Many other variables use _FillValue. Maybe they just forgot to add the attribute for latitude?

In any case, you can always get the data with:

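v = ds["sla_filtered"]
raw = Array(v.var)  # raw packed values, before scale_factor and add_offset are applied
# replace the default fill value by `missing` (the value, not the type `Missing`), then unpack
data = replace(raw, fillvalue(eltype(raw)) => missing) .* v.attrib["scale_factor"] .+ v.attrib["add_offset"];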

But if it is part of the standard, then it should be supported in NCDatasets. Maybe this is the place to ask: https://github.com/cf-convention/discuss.

timhultberg commented 2 years ago

The product contains PC-compressed hyperspectral infrared radiances from the IRS instrument of the upcoming MTG programme. For now this is simulated test data, but the format is as specified. Yes, there are other variables that use _FillValue, but they are all of floating-point type; the variables stored with integer types all rely on the default missing value. (For the other big upcoming programme, EPS-SG, this is not the case.) The way EUMETSAT works with big industrial contracts makes it virtually impossible to change anything once it has been specified. Thanks for your two suggestions. I will try to find out whether this is part of the standard and in the meantime consider the workaround.

Alexander-Barth commented 2 years ago

You can now also use this function, https://alexander-barth.github.io/NCDatasets.jl/latest/dataset/#NCDatasets.cfvariable, to specify a _FillValue.
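As a hedged sketch of how that might look, the example below writes a small test file and reads it back with `cfvariable`, passing the default fill value of short through the `fillvalue` keyword. The file name, the dimension name `x`, and the scale and offset values are made up for illustration, and the exact behaviour may depend on the NCDatasets version, so please check the linked documentation.

```julia
using NCDatasets

# Write a small test file: an Int16 variable with scale_factor and add_offset
# but no _FillValue attribute (names and values are illustrative only).
fname = tempname() * ".nc"
NCDataset(fname, "c") do ds
    defDim(ds, "x", 3)
    v = defVar(ds, "latitude", Int16, ("x",),
               attrib = Dict("scale_factor" => 0.5f0, "add_offset" => 10f0))
    # Write the raw packed values directly, bypassing scaling.
    v.var[:] = Int16[-32767, 0, 100]
end

# Read it back, telling NCDatasets to treat -32767 as the fill value.
ds = NCDataset(fname)
lat = cfvariable(ds, "latitude", fillvalue = Int16(-32767))
values = lat[:]   # the first element is expected to be `missing`
close(ds)
```

For a variable inside a group, such as latitude in the file discussed here, the same call would presumably be made on the group, e.g. `cfvariable(ds.group["data"], "latitude", fillvalue = Int16(-32767))`.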