pytroll / satpy

Python package for earth-observing satellite data processing
http://satpy.readthedocs.org/en/latest/
GNU General Public License v3.0
1.08k stars 298 forks source link

Why?Fill value incompatible with integer data using 65535 instead. #2383

Open yhyxzx opened 2 years ago

yhyxzx commented 2 years ago

Code Sample, a minimal, complete, and verifiable piece of code

Cloud_scn = Scene(filenames=S3_name)
Cloud_scn.load(['cloud'])
new_Cloud_scn = Cloud_scn.resample(area_def_therm, resampler='nearest')

[WARNING: 2022-11-16 23:06:02 : pyresample.kd_tree] Fill value incompatible with integer data using 65535 instead.

djhoese commented 2 years ago

This is likely a limitation of the reader you're using in Satpy. What reader are you using? Can you also do print(cloud_scn['cloud'].attrs) and paste the output here?

Basically what has happened is that cloud is likely an integer dataset, but has no _FillValue attribute set. Satpy/pyresample is trying to determine what fill value it should use for the output array. It defaults to NaN, but realizes that the data isn't floating point data (NaN can't be represented by integers).

There are workarounds to get rid of the warning like setting fill_value=0 or fill_value=65535 in your .resample call, but the longer term fix is likely to update/fix the reader in Satpy to include a _FillValue attribute in the DataArray so that pyresample can use that as a default.

yhyxzx commented 2 years ago
    Cloud_scn = Scene(filenames=S3_name)
    Cloud_scn.load(['cloud'])
    print(Cloud_scn['cloud'].attrs)
    new_Cloud_scn = Cloud_scn.resample(area_def_therm, resampler='nearest')  # resample with 'nearest' neighbour (default)
{
    'name': 'cloud',
    'sensor': 'slstr',
    'resolution': 500,
    'coordinates': ('longitude', 'latitude'),
    'view': 'nadir',
    'stripe': 'a',
    'file_type': 'esa_l1b_flag',
    'file_key': 'cloud_{stripe:1s}{view:1s}',
    'modifiers': (),
    'flag_masks': array([    1,     2,     4,     8,    16,    32,    64,   128,   256,
         512,  1024,  2048,  4096,  8192, 16384, 32768], dtype=uint16),
    'flag_meanings': 'visible 1.37_threshold 1.6_small_histogram 1.6_large_histogram 2.25_small_histogram 2.25_large_histogram 11_spatial_coherence gross_cloud thin_cirrus medium_high fog_low_stratus 11_12_view_difference 3.7_11_view_difference thermal_histogram spare spare',
    'platform_name': 'Sentinel-3A',
    'start_time': datetime.datetime(2022, 10, 7, 2, 2, 26, 493747),
    'end_time': datetime.datetime(2022, 10, 7, 2, 5, 26, 185521),
    'reader': 'slstr_l1b',
    'area': <pyresample.geometry.SwathDefinition object at 0x0000029621BC90A0>,
    '_satpy_id': DataID(name='cloud', resolution=500, view=<view.nadir>, stripe=<stripe.a>, modifiers=()),
    'ancillary_variables': []
}
[DEBUG: 2022-11-17 09:04:06 : satpy.resample] Resampling cloud_an
[WARNING: 2022-11-17 09:04:06 : pyresample.kd_tree] Fill value incompatible with integer data using 65535 instead.

@djhoese

djhoese commented 2 years ago

Sorry, just to make sure, could you do print(cloud_scn['cloud'].data)? I'm curious what the dtype and shape are of the array. Overall this seems like something that could be improved in the Satpy reader (slstr_l1b in this case). The flag_meanings attribute shows this is a category product, but it doesn't have an associated fill value. Based on the flag_meanings maybe 0 would be a good default for _FillValue since it seems 0 isn't used otherwise.

@mraspaud was one of the original creators of the reader (and @simonrp84 seems to have contributed to it as well). What do you guys think about this?

We should maybe transfer this issue to the Satpy repository.

yhyxzx commented 2 years ago

@djhoese

Cloud_scn = Scene(filenames=S3_name)
Cloud_scn.load(['cloud'])
print(Cloud_scn['cloud'].data)

dask.array<open_dataset-c3404432d88569a6f37315cbacb415a9cloud_an, shape=(2400, 3000), dtype=uint16, chunksize=(2400, 3000), chunktype=numpy.ndarray>

djhoese commented 2 years ago

@yhyxzx Can you point me to some of these files?

yhyxzx commented 2 years ago

https://scihub.copernicus.eu/dhus/odata/v1/Products('0f416a40-7591-4eeb-84a6-616cfc1a610f')/$value @djhoese

djhoese commented 2 years ago

Doing an ncdump -h on the flags file I see:

        ushort cloud_an(rows, columns) ;
                cloud_an:flag_masks = 1US, 2US, 4US, 8US, 16US, 32US, 64US, 128US, 256US, 512US, 1024US, 2048US, 4096US, 8192US, 16384US, 32768US ;
                cloud_an:flag_meanings = "visible 1.37_threshold 1.6_small_histogram 1.6_large_histogram 2.25_small_histogram 2.25_large_histogram 11_spatial_coherence gross_cloud thin_cirrus medium_high fog_low_stratus 11_12_view_difference 3.7_11_view_difference thermal_histogram spare spare" ;

So yeah these files don't really have a lot of metadata. I think the reader in Satpy could be updated to add it. Either way, @yhyxzx you can just specify fill_value=0 when you call Scene.resample and you're resampling the cloud product. Otherwise you can ignore this warning.

@mraspaud thoughts?